Search Results: "rla"

14 April 2023

John Goerzen: Easily Accessing All Your Stuff with a Zero-Trust Mesh VPN

Probably everyone is familiar with a regular VPN. The traditional use case is to connect to a corporate or home network from a remote location, and access services as if you were there. But these days, the notion of corporate network and home network are less based around physical location. For instance, a company may have no particular office at all, may have a number of offices plus a number of people working remotely, and so forth. A home network might have, say, a PVR and file server, while highly portable devices such as laptops, tablets, and phones may want to talk to each other regardless of location. For instance, a family member might be traveling with a laptop, another at a coffee shop, and those two devices might want to communicate, in addition to talking to the devices at home. And, in both scenarios, there might be questions about giving limited access to friends. Perhaps you d like to give a friend access to part of your file server, or as a company, you might have contractors working on a limited project. Pretty soon you wind up with a mess of VPNs, forwarded ports, and tricks to make it all work. With the increasing prevalence of CGNAT, a lot of times you can t even open a port to the public Internet. Each application or device probably has its own gateway just to make it visible on the Internet, some of which you pay for. Then you add on the question of: should you really trust your LAN anyhow? With possibilities of guests using it, rogue access points, etc., the answer is probably no . We can move the responsibility for dealing with NAT, fluctuating IPs, encryption, and authentication, from the application layer further down into the network stack. We then arrive at a much simpler picture for all. So this page is fundamentally about making the network work, simply and effectively.

How do we make the Internet work in these scenarios? We re going to combine three concepts:
  1. A VPN, providing fully encrypted and authenticated communication and stable IPs
  2. Mesh Networking, in which devices automatically discover optimal paths to reach each other
  3. Zero-trust networking, in which we do not need to trust anything about the underlying LAN, because all our traffic uses the secure systems in points 1 and 2.
By combining these concepts, we arrive at some nice results:
  • You can ssh hostname, where hostname is one of your machines (server, laptop, whatever), and as long as hostname is up, you can reach it, wherever it is, wherever you are.
    • Combined with mosh, these sessions will be durable even across moving to other host networks.
    • You could just as well use telnet, because the underlying network should be secure.
  • You don t have to mess with encryption keys, certs, etc., for every internal-only service. Since IPs are now trustworthy, that s all you need. hosts.allow could make a comeback!
  • You have a way of transiting out of extremely restrictive networks. Every tool discussed here has a way of falling back on routing things via a broker (relay) on TCP port 443 if all else fails.
There might sometimes be tradeoffs. For instance:
  • On LANs faster than 1Gbps, performance may degrade due to encryption and encapsulation overhead. However, these tools should let hosts discover the locality of each other and not send traffic over the Internet if the devices are local.
  • With some of these tools, hosts local to each other (on the same LAN) may be unable to find each other if they can t reach the control plane over the Internet (Internet is down or provider is down)
Some other features that some of the tools provide include:
  • Easy sharing of limited access with friends/guests
  • Taking care of everything you need, including SSL certs, for exposing a certain on-net service to the public Internet
  • Optional routing of your outbound Internet traffic via an exit node on your network. Useful, for instance, if your local network is blocking tons of stuff.
Let s dive in.

Types of Mesh VPNs I ll go over several types of meshes in this article:
  1. Fully decentralized with automatic hop routing This model has no special central control plane. Nodes discover each other in various ways, and establish routes to each other. These routes can be direct connections over the Internet, or via other nodes. This approach offers the greatest resilience. Examples I ll cover include Yggdrasil and tinc.
  2. Automatic peer-to-peer with centralized control In this model, nodes, by default, communicate by establishing direct links between them. A regular node never carries traffic on behalf of other nodes. Special-purpose relays are used to handle cases in which NAT traversal is impossible. This approach tends to offer simple setup. Examples I ll cover include Tailscale, Zerotier, Nebula, and Netmaker.
  3. Roll your own and hybrid approaches This is a grab bag of other ideas; for instance, running Yggdrasil over Tailscale.

Terminology For the sake of consistency, I m going to use common language to discuss things that have different terms in different ecosystems:
  • Every tool discussed here has a way of dealing with NAT traversal. It may assist with establishing direct connections (eg, STUN), and if that fails, it may simply relay traffic between nodes. I ll call such a relay a broker . This may or may not be the same system that is a control plane for a tool.
  • All of these systems operate over lower layers that are unencrypted. Those lower layers may be a LAN (wired or wireless, which may or may not have Internet access), or the public Internet (IPv4 and/or IPv6). I m going to call the unencrypted lower layer, whatever it is, the clearnet .

Evaluation Criteria Here are the things I want to see from a solution:
  • Secure, with all communications end-to-end encrypted and authenticated, and prevention of traffic from untrusted devices.
  • Flexible, adapting to changes in network topology quickly and automatically.
  • Resilient, without single points of failure, and with devices local to each other able to communicate even if cut off from the Internet or other parts of the network.
  • Private, minimizing leakage of information or metadata about me and my systems
  • Able to traverse CGNAT without having to use a broker whenever possible
  • A lesser requirement for me, but still a nice to have, is the ability to include others via something like Internet publishing or inviting guests.
  • Fully or nearly fully Open Source
  • Free or very cheap for personal use
  • Wide operating system support, including headless Linux on x86_64 and ARM.

Fully Decentralized VPNs with Automatic Hop Routing Two systems fit this description: Yggdrasil and Tinc. Let s dive in.

Yggdrasil I ll start with Yggdrasil because I ve written so much about it already. It featured in prior posts such as:

Yggdrasil can be a private mesh VPN, or something more Yggdrasil can be a private mesh VPN, just like the other tools covered here. It s unique, however, in that a key goal of the project is to also make it useful as a planet-scale global mesh network. As such, Yggdrasil is a testbed of new ideas in distributed routing designed to scale up to massive sizes and all sorts of connection conditions. As of 2023-04-10, the main global Yggdrasil mesh has over 5000 nodes in it. You can choose whether or not to participate. Every node in a Yggdrasil mesh has a public/private keypair. Each node then has an IPv6 address (in a private address space) derived from its public key. Using these IPv6 addresses, you can communicate right away. Yggdrasil differs from most of the other tools here in that it does not necessarily seek to establish a direct link on the clearnet between, say, host A and host G for them to communicate. It will prefer such a direct link if it exists, but it is perfectly happy if it doesn t. The reason is that every Yggdrasil node is also a router in the Yggdrasil mesh. Let s sit with that concept for a moment. Consider:
  • If you have a bunch of machines on your LAN, but only one of them can peer over the clearnet, that s fine; all the other machines will discover this route to the world and use it when necessary.
  • All you need to run a broker is just a regular node with a public IP address. If you are participating in the global mesh, you can use one (or more) of the free public peers for this purpose.
  • It is not necessary for every node to know about the clearnet IP address of every other node (improving privacy). In fact, it s not even necessary for every node to know about the existence of all the other nodes, so long as it can find a route to a given node when it s asked to.
  • Yggdrasil can find one or more routes between nodes, and it can use this knowledge of multiple routes to aggressively optimize for varying network conditions, including combinations of, say, downloads and low-latency ssh sessions.
Behind the scenes, Yggdrasil calculates optimal routes between nodes as necessary, using a mesh-wide DHT for initial contact and then deriving more optimal paths. (You can also read more details about the routing algorithm.) One final way that Yggdrasil is different from most of the other tools is that there is no separate control server. No node is special , in charge, the sole keeper of metadata, or anything like that. The entire system is completely distributed and auto-assembling.

Meeting neighbors There are two ways that Yggdrasil knows about peers:
  • By broadcast discovery on the local LAN
  • By listening on a specific port (or being told to connect to a specific host/port)
Sometimes this might lead to multiple ways to connect to a node; Yggdrasil prefers the connection auto-discovered by broadcast first, then the lowest-latency of the defined path. In other words, when your laptops are in the same room as each other on your local LAN, your packets will flow directly between them without traversing the Internet.

Unique uses Yggdrasil is uniquely suited to network-challenged situations. As an example, in a post-disaster situation, Internet access may be unavailable or flaky, yet there may be many local devices perhaps ones that had never known of each other before that could share information. Yggdrasil meets this situation perfectly. The combination of broadcast auto-detection, distributed routing, and so forth, basically means that if there is any physical path between two nodes, Yggdrasil will find and enable it. Ad-hoc wifi is rarely used because it is a real pain. Yggdrasil actually makes it useful! Its broadcast discovery doesn t require any IP address provisioned on the interface at all (it just uses the IPv6 link-local address), so you don t need to figure out a DHCP server or some such. And, Yggdrasil will tend to perform routing along the contours of the RF path. So you could have a laptop in the middle of a long distance relaying communications from people farther out, because it could see both. Or even a chain of such things.

Yggdrasil: Security and Privacy Yggdrasil s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep unauthorized traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously. I ll discuss firewalling more at the end of this article. Basically, you ll almost certainly want to do this if you participate in the public mesh, because doing so is akin to having a globally-routable public IP address direct to your device. If you want to restrict who can talk to your mesh, you just disable the broadcast feature on all your nodes (empty MulticastInterfaces section in the config), and avoid telling any of your nodes to connect to a public peer. You can set a list of authorized public keys that can connect to your nodes listening interfaces, which you ll probably want to do. You will probably want to either open up some inbound ports (if you can) or set up a node with a known clearnet IP on a place like a $5/mo VPS to help with NAT traversal (again, setting AllowedPublicKeys as appropriate). Yggdrasil doesn t allow filtering multicast clients by public key, only by network interface, so that s why we disable broadcast discovery. You can easily enough teach Yggdrasil about static internal LAN IPs of your nodes and have things work that way. (Or, set up an internal gateway node or two, that the clients just connect to when they re local). But fundamentally, you need to put a bit more thought into this with Yggdrasil than with the other tools here, which are closed-only. Compared to some of the other tools here, Yggdrasil is better about information leakage; nodes only know details, such as clearnet IPs, of directly-connected peers. You can obtain the list of directly-connected peers of any known node in the mesh but that list is the public keys of the directly-connected peers, not the clearnet IPs. Some of the other tools contain a limited integrated firewall of sorts (with limited ACLs and such). Yggdrasil does not, but is fully compatible with on-host firewalls. I recommend these anyway even with many other tools.

Yggdrasil: Connectivity and NAT traversal Compared to the other tools, Yggdrasil is an interesting mix. It provides a fully functional mesh and facilitates connectivity in situations in which no other tool can. Yet its NAT traversal, while it exists and does work, results in using a broker under some of the more challenging CGNAT situations more often than some of the other tools, which can impede performance. Yggdrasil s underlying protocol is TCP-based. Before you run away screaming that it must be slow and unreliable like OpenVPN over TCP it s not, and it is even surprisingly good around bufferbloat. I ve found its performance to be on par with the other tools here, and it works as well as I d expect even on flaky 4G links. Overall, the NAT traversal story is mixed. On the one hand, you can run a node that listens on port 443 and Yggdrasil can even make it speak TLS (even though that s unnecessary from a security standpoint), so you can likely get out of most restrictive firewalls you will ever encounter. If you join the public mesh, know that plenty of public peers do listen on port 443 (and other well-known ports like 53, plus random high-numbered ones). If you connect your system to multiple public peers, there is a chance though a very small one that some public transit traffic might be routed via it. In practice, public peers hopefully are already peered with each other, preventing this from happening (you can verify this with yggdrasilctl debug_remotegetpeers key=ABC...). I have never experienced a problem with this. Also, since latency is a factor in routing for Yggdrasil, it is highly unlikely that random connections we use are going to be competitive with datacenter peers.

Yggdrasil: Sharing with friends If you re open to participating in the public mesh, this is one of the easiest things of all. Have your friend install Yggdrasil, point them to a public peer, give them your Yggdrasil IP, and that s it. (Well, presumably you also open up your firewall you did follow my advice to set one up, right?) If your friend is visiting at your location, they can just hop on your wifi, install Yggdrasil, and it will automatically discover a route to you. Yggdrasil even has a zero-config mode for ephemeral nodes such as certain Docker containers. Yggdrasil doesn t directly support publishing to the clearnet, but it is certainly possible to proxy (or even NAT) to/from the clearnet, and people do.

Yggdrasil: DNS There is no particular extra DNS in Yggdrasil. You can, of course, run a DNS server within Yggdrasil, just as you can anywhere else. Personally I just add relevant hosts to /etc/hosts and leave it at that, but it s up to you.

Yggdrasil: Source code, pricing, and portability Yggdrasil is fully open source (LGPLv3 plus additional permissions in an exception) and highly portable. It is written in Go, and has prebuilt binaries for all major platforms (including a Debian package which I made). There is no charge for anything with Yggdrasil. Listed public peers are free and run by volunteers. You can run your own peers if you like; they can be public and unlisted, public and listed (just submit a PR to get it listed), or private (accepting connections only from certain nodes keys). A peer in this case is just a node with a known clearnet IP address. Yggdrasil encourages use in other projects. For instance, NNCP integrates a Yggdrasil node for easy communication with other NNCP nodes.

Yggdrasil conclusions Yggdrasil is tops in reliability (having no single point of failure) and flexibility. It will maintain opportunistic connections between peers even if the Internet is down. The unique added feature of being able to be part of a global mesh is a nice one. The tradeoffs include being more prone to need to use a broker in restrictive CGNAT environments. Some other tools have clients that override the OS DNS resolver to also provide resolution of hostnames of member nodes; Yggdrasil doesn t, though you can certainly run your own DNS infrastructure over Yggdrasil (or, for that matter, let public DNS servers provide Yggdrasil answers if you wish). There is also a need to pay more attention to firewalling or maintaining separation from the public mesh. However, as I explain below, many other options have potential impacts if the control plane, or your account for it, are compromised, meaning you ought to firewall those, too. Still, it may be a more immediate concern with Yggdrasil. Although Yggdrasil is listed as experimental, I have been using it for over a year and have found it to be rock-solid. They did change how mesh IPs were calculated when moving from 0.3 to 0.4, causing a global renumbering, so just be aware that this is a possibility while it is experimental.

tinc tinc is the oldest tool on this list; version 1.0 came out in 2003! You can think of tinc as something akin to an older Yggdrasil without the public option. I will be discussing tinc 1.0.36, the latest stable version, which came out in 2019. The development branch, 1.1, has been going since 2011 and had its latest release in 2021. The last commit to the Github repo was in June 2022. Tinc is the only tool here to support both tun and tap style interfaces. I go into the difference more in the Zerotier review below. Tinc actually provides a better tap implementation than Zerotier, with various sane options for broadcasts, but I still think the call for an Ethernet, as opposed to IP, VPN is small. To configure tinc, you generate a per-host configuration and then distribute it to every tinc node. It contains a host s public key. Therefore, adding a host to the mesh means distributing its key everywhere; de-authorizing it means removing its key everywhere. This makes it rather unwieldy. tinc can do LAN broadcast discovery and mesh routing, but generally speaking you must manually teach it where to connect initially. Somewhat confusingly, the examples all mention listing a public address for a node. This doesn t make sense for a laptop, and I suspect you d just omit it. I think that address is used for something akin to a Yggdrasil peer with a clearnet IP. Unlike all of the other tools described here, tinc has no tool to inspect the running state of the mesh. Some of the properties of tinc made it clear I was unlikely to adopt it, so this review wasn t as thorough as that of Yggdrasil.

tinc: Security and Privacy As mentioned above, every host in the tinc mesh is authenticated based on its public key. However, to be more precise, this key is validated only at the point it connects to its next hop peer. (To be sure, this is also the same as how the list of allowed pubkeys works in Yggdrasil.) Since IPs in tinc are not derived from their key, and any host can assign itself whatever mesh IP it likes, this implies that a compromised host could impersonate another. It is unclear whether packets are end-to-end encrypted when using a tinc node as a router. The fact that they can be routed at the kernel level by the tun interface implies that they may not be.

tinc: Connectivity and NAT traversal I was unable to find much information about NAT traversal in tinc, other than that it does support it. tinc can run over UDP or TCP and auto-detects which to use, preferring UDP.

tinc: Sharing with friends tinc has no special support for this, and the difficulty of configuration makes it unlikely you d do this with tinc.

tinc: Source code, pricing, and portability tinc is fully open source (GPLv2). It is written in C and generally portable. It supports some very old operating systems. Mobile support is iffy. tinc does not seem to be very actively maintained.

tinc conclusions I haven t mentioned performance in my other reviews (see the section at the end of this post). But, it is so poor as to only run about 300Mbps on my 2.5Gbps network. That s 1/3 the speed of Yggdrasil or Tailscale. Combine that with the unwieldiness of adding hosts and some uncertainties in security, and I m not going to be using tinc.

Automatic Peer-to-Peer Mesh VPNs with centralized control These tend to be the options that are frequently discussed. Let s talk about the options.

Tailscale Tailscale is a popular choice in this type of VPN. To use Tailscale, you first sign up on tailscale.com. Then, you install the tailscale client on each machine. On first run, it prints a URL for you to click on to authorize the client to your mesh ( tailnet ). Tailscale assigns a mesh IP to each system. The Tailscale client lets the Tailscale control plane gather IP information about each node, including all detectable public and private clearnet IPs. When you attempt to contact a node via Tailscale, the client will fetch the known contact information from the control plane and attempt to establish a link. If it can contact over the local LAN, it will (it doesn t have broadcast autodetection like Yggdrasil; the information must come from the control plane). Otherwise, it will try various NAT traversal options. If all else fails, it will use a broker to relay traffic; Tailscale calls a broker a DERP relay server. Unlike Yggdrasil, a Tailscale node never relays traffic for another; all connections are either direct P2P or via a broker. Tailscale, like several others, is based around Wireguard; though wireguard-go rather than the in-kernel Wireguard. Tailscale has a number of somewhat unique features in this space:
  • Funnel, which lets you expose ports on your system to the public Internet via the VPN.
  • Exit nodes, which automate the process of routing your public Internet traffic over some other node in the network. This is possible with every tool mentioned here, but Tailscale makes switching it on or off a couple of quick commands away.
  • Node sharing, which lets you share a subset of your network with guests
  • A fantastic set of documentation, easily the best of the bunch.
Funnel, in particular, is interesting. With a couple of tailscale serve -style commands, you can expose a directory tree (or a development webserver) to the world. Tailscale gives you a public hostname, obtains a cert for it, and proxies inbound traffic to you. This is subject to some unspecified bandwidth limits, and you can only choose from three public ports, so it s not really a production solution but as a quick and easy way to demonstrate something cool to a friend, it s a neat feature.

Tailscale: Security and Privacy With Tailscale, as with the other tools in this category, one of the main threats to consider is the control plane. What are the consequences of a compromise of Tailscale s control plane, or of the credentials you use to access it? Let s begin with the credentials used to access it. Tailscale operates no identity system itself, instead relying on third parties. For individuals, this means Google, Github, or Microsoft accounts; Okta and other SAML and similar identity providers are also supported, but this runs into complexity and expense that most individuals aren t wanting to take on. Unfortunately, all three of those types of accounts often have saved auth tokens in a browser. Personally I would rather have a separate, very secure, login. If a person does compromise your account or the Tailscale servers themselves, they can t directly eavesdrop on your traffic because it is end-to-end encrypted. However, assuming an attacker obtains access to your account, they could:
  • Tamper with your Tailscale ACLs, permitting new actions
  • Add new nodes to the network
  • Forcibly remove nodes from the network
  • Enable or disable optional features
Of note is that they cannot just commandeer an existing IP. I would say the riskiest possibility here is that could add new nodes to the mesh. Because they could also tamper with your ACLs, they could then proceed to attempt to access all your internal services. They could even turn on service collection and have Tailscale tell them what and where all the services are. Therefore, as with other tools, I recommend a local firewall on each machine with Tailscale. More on that below. Tailscale has a new alpha feature called tailnet lock which helps with this problem. It requires existing nodes in the mesh to sign a request for a new node to join. Although this doesn t address ACL tampering and some of the other things, it does represent a significant help with the most significant concern. However, tailnet lock is in alpha, only available on the Enterprise plan, and has a waitlist, so I have been unable to test it. Any Tailscale node can request the IP addresses belonging to any other Tailscale node. The Tailscale control plane captures, and exposes to you, this information about every node in your network: the OS hostname, IP addresses and port numbers, operating system, creation date, last seen timestamp, and NAT traversal parameters. You can optionally enable service data capture as well, which sends data about open ports on each node to the control plane. Tailscale likes to highlight their key expiry and rotation feature. By default, all keys expire after 180 days, and traffic to and from the expired node will be interrupted until they are renewed (basically, you re-login with your provider and do a renew operation). Unfortunately, the only mention I can see of warning of impeding expiration is in the Windows client, and even there you need to edit a registry key to get the warning more than the default 24 hours in advance. In short, it seems likely to cut off communications when it s most important. You can disable key expiry on a per-node basis in the admin console web interface, and I mostly do, due to not wanting to lose connectivity at an inopportune time.

Tailscale: Connectivity and NAT traversal When thinking about reliability, the primary consideration here is being able to reach the Tailscale control plane. While it is possible in limited circumstances to reach nodes without the Tailscale control plane, it is a fairly brittle setup and notably will not survive a client restart. So if you use Tailscale to reach other nodes on your LAN, that won t work unless your Internet is up and the control plane is reachable. Assuming your Internet is up and Tailscale s infrastructure is up, there is little to be concerned with. Your own comfort level with cloud providers and your Internet should guide you here. Tailscale wrote a fantastic article about NAT traversal and they, predictably, do very well with it. Tailscale prefers UDP but falls back to TCP if needed. Broker (DERP) servers step in as a last resort, and Tailscale clients automatically select the best ones. I m not aware of anything that is more successful with NAT traversal than Tailscale. This maximizes the situations in which a direct P2P connection can be used without a broker. I have found Tailscale to be a bit slow to notice changes in network topography compared to Yggdrasil, and sometimes needs a kick in the form of restarting the client process to re-establish communications after a network change. However, it s possible (maybe even probable) that if I d waited a bit longer, it would have sorted this all out.

Tailscale: Sharing with friends I touched on the funnel feature earlier. The sharing feature lets you give an invite to an outsider. By default, a person accepting a share can make only outgoing connections to the network they re invited to, and cannot receive incoming connections from that network this makes sense. When sharing an exit node, you get a checkbox that lets you share access to the exit node as well. Of course, the person accepting the share needs to install the Tailnet client. The combination of funnel and sharing make Tailscale the best for ad-hoc sharing.

Tailscale: DNS Tailscale s DNS is called MagicDNS. It runs as a layer atop your standard DNS taking over /etc/resolv.conf on Linux and provides resolution of mesh hostnames and some other features. This is a concept that is pretty slick. It also is a bit flaky on Linux; dueling programs want to write to /etc/resolv.conf. I can t really say this is entirely Tailscale s fault; they document the problem and some workarounds. I would love to be able to add custom records to this service; for instance, to override the public IP for a service to use the in-mesh IP. Unfortunately, that s not yet possible. However, MagicDNS can query existing nameservers for certain domains in a split DNS setup.

Tailscale: Source code, pricing, and portability Tailscale is almost fully open source and the client is highly portable. The client is open source (BSD 3-clause) on open source platforms, and closed source on closed source platforms. The DERP servers are open source. The coordination server is closed source, although there is an open source coordination server called Headscale (also BSD 3-clause) made available with Tailscale s blessing and informal support. It supports most, but not all, features in the Tailscale coordination server. Tailscale s pricing (which does not apply when using Headscale) provides a free plan for 1 user with up to 20 devices. A Personal Pro plan expands that to 100 devices for $48 per year - not a bad deal at $4/mo. A Community on Github plan also exists, and then there are more business-oriented plans as well. See the pricing page for details. As a small note, I appreciated Tailscale s install script. It properly added Tailscale s apt key in a way that it can only be used to authenticate the Tailscale repo, rather than as a systemwide authenticator. This is a nice touch and speaks well of their developers.

Tailscale conclusions Tailscale is tops in sharing and has a broad feature set and excellent documentation. Like other solutions with a centralized control plane, device communications can stop working if the control plane is unreachable, and the threat model of the control plane should be carefully considered.

Zerotier Zerotier is a close competitor to Tailscale, and is similar to it in a lot of ways. So rather than duplicate all of the Tailscale information here, I m mainly going to describe how it differs from Tailscale. The primary difference between the two is that Zerotier emulates an Ethernet network via a Linux tap interface, while Tailscale emulates a TCP/IP network via a Linux tun interface. However, Zerotier has a number of things that make it be a somewhat imperfect Ethernet emulator. For one, it has a problem with broadcast amplification; the machine sending the broadcast sends it to all the other nodes that should receive it (up to a set maximum). I wouldn t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECNet across Zerotier, I m not really convinced there s much call for that these days, and Zerotier is clearly IP-focused as it allocates IP addresses and such anyhow. Zerotier provides special support for emulated ARP (IPv4) and NDP (IPv6). While you could theoretically run Zerotier as a bridge, this eliminates the zero trust principle, and Tailscale supports subnet routers, which provide much of the same feature set anyhow. A somewhat obscure feature, but possibly useful, is Zerotier s built-in support for multipath WAN for the public interface. This actually lets you do a somewhat basic kind of channel bonding for WAN.

Zerotier: Security and Privacy The picture here is similar to Tailscale, with the difference that you can create a Zerotier-local account rather than relying on cloud authentication. I was unable to find as much detail about Zerotier as I could about Tailscale - notably I couldn t find anything about how sticky an IP address is. However, the configuration screen lets me delete a node and assign additional arbitrary IPs within a subnet to other nodes, so I think the assumption here is that if your Zerotier account (or the Zerotier control plane) is compromised, an attacker could remove a legit device, add a malicious one, and assign the previous IP of the legit device to the malicious one. I m not sure how to mitigate against that risk, as firewalling specific IPs is ineffective if an attacker can simply take them over. Zerotier also lacks anything akin to Tailnet Lock. For this reason, I didn t proceed much further in my Zerotier evaluation.

Zerotier: Connectivity and NAT traversal Like Tailscale, Zerotier has NAT traversal with STUN. However, it looks like it s more limited than Tailscale s, and in particular is incompatible with double NAT that is often seen these days. Zerotier operates brokers ( root servers ) that can do relaying, including TCP relaying. So you should be able to connect even from hostile networks, but you are less likely to form a P2P connection than with Tailscale.

Zerotier: Sharing with friends I was unable to find any special features relating to this in the Zerotier documentation. Therefore, it would be at the same level as Yggdrasil: possible, maybe even not too difficult, but without any specific help.

Zerotier: DNS Unlike Tailscale, Zerotier does not support automatically adding DNS entries for your hosts. Therefore, your options are approximately the same as Yggdrasil, though with the added option of pushing configuration pointing to your own non-Zerotier DNS servers to the client.

Zerotier: Source code, pricing, and portability The client ZeroTier One is available on Github under a custom business source license which prevents you from using it in certain settings. This license would preclude it being included in Debian. Their library, libzt, is available under the same license. The pricing page mentions a community edition for self hosting, but the documentation is sparse and it was difficult to understand what its feature set really is. The free plan lets you have 1 user with up to 25 devices. Paid plans are also available.

Zerotier conclusions Frankly I don t see much reason to use Zerotier. The virtual Ethernet model seems to be a weird hybrid that doesn t bring much value. I m concerned about the implications of a compromise of a user account or the control plane, and it lacks a lot of Tailscale features (MagicDNS and sharing). The only thing it may offer in particular is multipath WAN, but that s esoteric enough and also solvable at other layers that it doesn t seem all that compelling to me. Add to that the strange license and, to me anyhow, I don t see much reason to bother with it.

Netmaker Netmaker is one of the projects that is making noise these days. Netmaker is the only one here that is a wrapper around in-kernel Wireguard, which can make a performance difference when talking to peers on a 1Gbps or faster link. Also, unlike other tools, it has an ingress gateway feature that lets people that don t have the Netmaker client, but do have Wireguard, participate in the VPN. I believe I also saw a reference somewhere to nodes as routers as with Yggdrasil, but I m failing to dig it up now. The project is in a bit of an early state; you can sign up for an upcoming closed beta with a SaaS host, but really you are generally pointed to self-hosting using the code in the github repo. There are community and enterprise editions, but it s not clear how to actually choose. The server has a bunch of components: binary, CoreDNS, database, and web server. It also requires elevated privileges on the host, in addition to a container engine. Contrast that to the single binary that some others provide. It looks like releases are frequent, but sometimes break things, and have a somewhat more laborious upgrade processes than most. I don t want to spend a lot of time managing my mesh. So because of the heavy needs of the server, the upgrades being labor-intensive, it taking over iptables and such on the server, I didn t proceed with a more in-depth evaluation of Netmaker. It has a lot of promise, but for me, it doesn t seem to be in a state that will meet my needs yet.

Nebula Nebula is an interesting mesh project that originated within Slack, seems to still be primarily sponsored by Slack, but is also being developed by Defined Networking (though their product looks early right now). Unlike the other tools in this section, Nebula doesn t have a web interface at all. Defined Networking looks likely to provide something of a SaaS service, but for now, you will need to run a broker ( lighthouse ) yourself; perhaps on a $5/mo VPS. Due to the poor firewall traversal properties, I didn t do a full evaluation of Nebula, but it still has a very interesting design.

Nebula: Security and Privacy Since Nebula lacks a traditional control plane, the root of trust in Nebula is a CA (certificate authority). The documentation gives this example of setting it up:
./nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
./nebula-cert sign -name "laptop" -ip "192.168.100.2/24" -groups "laptop,home,ssh"
./nebula-cert sign -name "server1" -ip "192.168.100.9/24" -groups "servers"
./nebula-cert sign -name "host3" -ip "192.168.100.10/24"
So the cert contains your IP, hostname, and group allocation. Each host in the mesh gets your CA certificate, and the per-host cert and key generated from each of these steps. This leads to a really nice security model. Your CA is the gatekeeper to what is trusted in your mesh. You can even have it airgapped or something to make it exceptionally difficult to breach the perimeter. Nebula contains an integrated firewall. Because the ability to keep out unwanted nodes is so strong, I would say this may be the one mesh VPN you might consider using without bothering with an additional on-host firewall. You can define static mappings from a Nebula mesh IP to a clearnet IP. I haven t found information on this, but theoretically if NAT traversal isn t required, these static mappings may allow Nebula nodes to reach each other even if Internet is down. I don t know if this is truly the case, however.

Nebula: Connectivity and NAT traversal This is a weak point of Nebula. Nebula sends all traffic over a single UDP port; there is no provision for using TCP. This is an issue at certain hotel and other public networks which open only TCP egress ports 80 and 443. I couldn t find a lot of detail on what Nebula s NAT traversal is capable of, but according to a certain Github issue, this has been a sore spot for years and isn t as capable as Tailscale. You can designate nodes in Nebula as brokers (relays). The concept is the same as Yggdrasil, but it s less versatile. You have to manually designate what relay to use. It s unclear to me what happens if different nodes designate different relays. Keep in mind that this always happens over a UDP port.

Nebula: Sharing with friends There is no particular support here.

Nebula: DNS Nebula has experimental DNS support. In contrast with Tailscale, which has an internal DNS server on every node, Nebula only runs a DNS server on a lighthouse. This means that it can t forward requests to a DNS server that s upstream for your laptop s particular current location. Actually, Nebula s DNS server doesn t forward at all. It also doesn t resolve its own name. The Nebula documentation makes reference to using multiple lighthouses, which you may want to do for DNS redundancy or performance, but it s unclear to me if this would make each lighthouse form a complete picture of the network.

Nebula: Source code, pricing, and portability Nebula is fully open source (MIT). It consists of a single Go binary and configuration. It is fairly portable.

Nebula conclusions I am attracted to Nebula s unique security model. I would probably be more seriously considering it if not for the lack of support for TCP and poor general NAT traversal properties. Its datacenter connectivity heritage does show through.

Roll your own and hybrid Here is a grab bag of ideas:

Running Yggdrasil over Tailscale One possibility would be to use Tailscale for its superior NAT traversal, then allow Yggdrasil to run over it. (You will need a firewall to prevent Tailscale from trying to run over Yggdrasil at the same time!) This creates a closed network with all the benefits of Yggdrasil, yet getting the NAT traversal from Tailscale. Drawbacks might be the overhead of the double encryption and double encapsulation. A good Yggdrasil peer may wind up being faster than this anyhow.

Public VPN provider for NAT traversal A public VPN provider such as Mullvad will often offer incoming port forwarding and nodes in many cities. This could be an attractive way to solve a bunch of NAT traversal problems: just use one of those services to get you an incoming port, and run whatever you like over that. Be aware that a number of public VPN clients have a kill switch to prevent any traffic from egressing without using the VPN; see, for instance, Mullvad s. You ll need to disable this if you are running a mesh atop it.

Other

Combining with local firewalls For most of these tools, I recommend using a local firewal in conjunction with them. I have been using firehol and find it to be quite nice. This means you don t have to trust the mesh, the control plane, or whatever. The catch is that you do need your mesh VPN to provide strong association between IP address and node. Most, but not all, do.

Performance I tested some of these for performance using iperf3 on a 2.5Gbps LAN. Here are the results. All speeds are in Mbps.
Tool iperf3 (default) iperf3 -P 10 iperf3 -R
Direct (no VPN) 2406 2406 2764
Wireguard (kernel) 1515 1566 2027
Yggdrasil 892 1126 1105
Tailscale 950 1034 1085
Tinc 296 300 277
You can see that Wireguard was significantly faster than the other options. Tailscale and Yggdrasil were roughly comparable, and Tinc was terrible.

IP collisions When you are communicating over a network such as these, you need to trust that the IP address you are communicating with belongs to the system you think it does. This protects against two malicious actor scenarios:
  1. Someone compromises one machine on your mesh and reconfigures it to impersonate a more important one
  2. Someone connects an unauthorized system to the mesh, taking over a trusted IP, and uses the privileges of the trusted IP to access resources
To summarize the state of play as highlighted in the reviews above:
  • Yggdrasil derives IPv6 addresses from a public key
  • tinc allows any node to set any IP
  • Tailscale IPs aren t user-assignable, but the assignment algorithm is unknown
  • Zerotier allows any IP to be allocated to any node at the control plane
  • I don t know what Netmaker does
  • Nebula IPs are baked into the cert and signed by the CA, but I haven t verified the enforcement algorithm
So this discussion really only applies to Yggdrasil and Tailscale. tinc and Zerotier lack detailed IP security, while Nebula expects IP allocations to be handled outside of the tool and baked into the certs (therefore enforcing rigidity at that level). So the question for Yggdrasil and Tailscale is: how easy is it to commandeer a trusted IP? Yggdrasil has a brief discussion of this. In short, Yggdrasil offers you both a dedicated IP and a rarely-used /64 prefix which you can delegate to other machines on your LAN. Obviously by taking the dedicated IP, a lot more bits are available for the hash of the node s public key, making collisions technically impractical, if not outright impossible. However, if you use the /64 prefix, a collision may be more possible. Yggdrasil s hashing algorithm includes some optimizations to make this more difficult. Yggdrasil includes a genkeys tool that uses more CPU cycles to generate keys that are maximally difficult to collide with. Tailscale doesn t document their IP assignment algorithm, but I think it is safe to say that the larger subnet you use, the better. If you try to use a /24 for your mesh, it is certainly conceivable that an attacker could remove your trusted node, then just manually add the 240 or so machines it would take to get that IP reassigned. It might be a good idea to use a purely IPv6 mesh with Tailscale to minimize this problem as well. So, I think the risk is low in the default configurations of both Yggdrasil and Tailscale (certainly lower than with tinc or Zerotier). You can drive the risk even lower with both.

Final thoughts For my own purposes, I suspect I will remain with Yggdrasil in some fashion. Maybe I will just take the small performance hit that using a relay node implies. Or perhaps I will get clever and use an incoming VPN port forward or go over Tailscale. Tailscale was the other option that seemed most interesting. However, living in a region with Internet that goes down more often than I d like, I would like to just be able to send as much traffic over a mesh as possible, trusting that if the LAN is up, the mesh is up. I have one thing that really benefits from performance in excess of Yggdrasil or Tailscale: NFS. That s between two machines that never leave my LAN, so I will probably just set up a direct Wireguard link between them. Heck of a lot easier than trying to do Kerberos! Finally, I wrote this intending to be useful. I dealt with a lot of complexity and under-documentation, so it s possible I got something wrong somewhere. Please let me know if you find any errors.
This blog post is a copy of a page on my website. That page may be periodically updated.

10 April 2023

Simon Josefsson: Trisquel is 42% Reproducible!

The absolute number may not be impressive, but what I hope is at least a useful contribution is that there actually is a number on how much of Trisquel is reproducible. Hopefully this will inspire others to help improve the actual metric. tl;dr: go to reproduce-trisquel. When I set about to understand how Trisquel worked, I identified a number of things that would improve my confidence in it. The lowest hanging fruit for me was to manually audit the package archive, and I wrote a tool called debdistdiff to automate this for me. That led me to think about apt archive transparency more in general. I have made some further work in that area (hint: apt-verify) that deserve its own blog post eventually. Most of apt archive transparency is futile if we don t trust the intended packages that are in the archive. One way to measurable increase trust in the package are to provide reproducible builds of the packages, which should by now be an established best practice. Code review is still important, but since it will never provide positive guarantees we need other processes that can identify sub-optimal situations automatically. The way reproducible builds easily identify negative results is what I believe has driven much of its success: its results are tangible and measurable. The field of software engineering is in need of more such practices. The design of my setup to build Trisquel reproducible are as follows. I did not expect to be able to use the GitLab shared runners to do the building, however they turned out to work quite well and I postponed setting up my own runner. There is a manually curated lists/disabled-aramo.txt with some packages that all required too much disk space or took over two hours to build. Today I finally took the time to setup a GitLab runner using podman running Trisquel aramo, and I expect to complete builds of the remaining packages soon one of my Dell R630 server with 256GB RAM and dual 2680v4 CPUs should deliver sufficient performance. Current limitations and ideas on further work (most are filed as project issues) include: Happy Easter Hacking! Update 2023-04-17: The original project reproduce-trisquel that was announced here has been archived and replaced with two projects, one generic debdistreproduce and one with results for Trisquel: reproduce/trisquel .

30 March 2023

Jonathan McDowell: Buttering up my storage

(TL;DR: I ve been trying out btrfs in some places instead of ext4, I ve hit absolutely zero issues and there are a few features that make me plan to use it more.) Despite (or perhaps because of) working on storage products for a reasonable chunk of my career I have tended towards a conservative approach to my filesystems. By the time I came to Linux ext2 was well established, the move to ext3 was a logical one (the joys of added journalling for faster recovery after unclean shutdowns) and for a long time my default stack has been MD raid with LVM2 on top and then ext4 as the filesystem. I ve dabbled with other filesystems; I ran XFS for a while on my VDR machine, and also when I had a large tradspool with INN, but never really had a hard requirement for it. I ve ended up adminning a machine that had JFS in the past, largely for historical reasons, but don t really remember any issues (vague recollections of NFS problems but that might just have been NFS being NFS). However. ZFS has gathered itself a significant fan base and that makes me wonder about what it can offer and whether I want that. Firstly, let s be clear that I m never going to run a primary filesystem that isn t part of the mainline kernel. So ZFS itself is out, because I run Linux. So what do I want that I can t get with ext4? Firstly, I d like data checksumming. As storage gets larger there s a bigger chance of silent data corruption and while I have backups of the important stuff that doesn t help if you don t know you need to use them. Secondly, these days I have machines running containers, VMs, or with lots of source checkouts with a reasonable amount of overlap in their data. Disk space has got cheaper, but I d still like to be able to do some sort of deduplication of common blocks. So, I ve been trying out btrfs. When I installed my desktop I went with btrfs for / and /home (I kept /boot as ext4). The thought process was that this was a local machine (so easy access if it all went wrong) and I take regular backups (so if it all went wrong I could recover). That was a year and a half ago and it s been pretty dull; I mostly forget I m running btrfs instead of ext4. This is on a machine that tracks Debian testing, so currently on kernel 6.1 but originally installed with 5.10. So it seems modern btrfs is reasonably stable for a machine that isn t driven especially hard. Good start. The fact I forget what filesystem I m running points to the fact that I m not actually doing anything special here. I get the advantage of data checksumming, but not much else. 2 things spring to mind. Firstly, I don t do snapshots. Given I run testing it might be wiser if I did take a snapshot before every apt-get upgrade, and I have a friend who does just that, but even when I ve run unstable I ve never had a machine get itself into a state that I couldn t recover so I haven t spent time investigating. I note Ubuntu has apt-btrfs-snapshot but it doesn t seem to have any updates for years. The other thing I didn t do when I installed my desktop is take advantage of subvolumes. I m still trying to get my head around exactly what I want them for, but they provide a partial replacement for LVM when it comes to carving up disk space. Instead of the separate / and /home LVs I created I could have created a single LV that would have a single btrfs filesystem on it. / and /home would then be separate subvolumes, allowing me to snapshot each individually. Quotas can also be applied separately so there s still the potential to prevent one subvolume taking all available space. Encouraged by the lack of hassle with my desktop I decided to try moving my sbuild machine over to use btrfs for its build chroots. For Reasons this is a VM kindly hosted by a friend, rather than something local. To be honest these days I would probably go for local hosting, but it works and there s no strong reason to move. The point is it s remote, and so if migrating went wrong and I had to ask for assistance I d be bothering someone who s doing me a favour as it is. The build VM is, of course, running LVM, and there was luckily some free space available. I m reasonably sure the underlying storage involves spinning rust, so I did a laborious set of pvmove commands to make sure all the available space was at the start of the PV, and created a new btrfs volume there. I was advised that while btrfs-convert would do the job it was better to create a fresh filesystem where possible. This time I did create an initial root subvolume. Configuring up sbuild was then much simpler than I d expected. My setup originally started out as a set of tarballs for the chroots that would get untarred + used for the builds, which is pretty slow. Once overlayfs was mature enough I switched to that. I d had a conversation with Enrico about his nspawn/btrfs setup, but it turned out Russ Allbery had written an excellent set of instructions on sbuild with btrfs. I tweaked my existing setup based on his details, and I was in business. Each chroot is a separate subvolume - I don t actually end up having to mount them individually, but it means that only the chroot in use gets snapshotted. For example during a build the following can be observed:
# btrfs subvolume list /
ID 257 gen 111534 top level 5 path root
ID 271 gen 111525 top level 257 path srv/chroot/unstable-amd64-sbuild
ID 275 gen 27873 top level 257 path srv/chroot/bullseye-amd64-sbuild
ID 276 gen 27873 top level 257 path srv/chroot/buster-amd64-sbuild
ID 343 gen 111533 top level 257 path srv/chroot/snapshots/unstable-amd64-sbuild-328059a0-e74b-4d9f-be70-24b59ccba121
I was a little confused about whether I d got something wrong because the snapshot top level is listed as 257 rather than 271, but digging further with btrfs subvolume show on the 2 mounted directories correctly showed the snapshot had a parent equal to the chroot, not /. As a final step I ran jdupes via jdupes -1Br / to deduplicate things across the filesystem. It didn t end up providing a significant saving unfortunately - I guess there s a reasonable amount of change between Debian releases - but I think tried it on my desktop, which tends to have a large number of similar source trees checked out. There I managed to save about 5% on /home, which didn t seem too shabby. The sbuild setup has been in place for a couple of months now, and I ve run quite a few builds on it while preparing for the freeze. So I m fairly confident in the stability of the setup and my next move is to transition my local house server over to btrfs for its containers (which all run under systemd-nspawn). Those are generally running a Debian stable base so there should be a decent amount of commonality for deduping. I m not saying I m yet at the point where I ll default to btrfs on new installs, but I m definitely looking at it for situations where I think I can get benefits from deduplication, or being able to divide up disk space without hard partitioning space. (And, just to answer the worry I had when I started, I ve got nowhere near ENOSPC problems, but I believe they re handled much more gracefully these days. And my experience of ZFS when it got above 90% utilization was far from ideal too.)

28 March 2023

Shirish Agarwal: Three Ancient Civilizations, Tesla, IMU and other things.

Three Ancient Civilizations I have been reading books for a long time but somehow I don t know how or when I realized that there are three medieval civilizations that time and again seem to fascinate the authors, either American or European. The three civilizations that do get mentioned every now and then are Egyptian, Greece and Roman and I have no clue as to why. Another two civilizations closely follow them, Mesopotian and Sumerian Civilizations. Why most of the authors irrespective of genre are mystified by these 5 civilizations is beyond me but they also conjure lot of imagination. Now if I was on the right side of 40, maybe 22-25 onwards and had the means or the opportunity or both, I would have gone and tried to learn as much as I can about these various civilizations. There is still a lot of enigma attached and it seems that the official explanations of how these various civilizations ended seems too good to be true or whatever. One of the more interesting points has been how Greece mythology were subverted and flipped to make Roman mythology. Apparently, many Greek gods have a Roman aspect and the qualities are opposite to the Greek aspect. I haven t learned the reason why it is so, yet. Apart from the Iziko Natural Museum in South Africa which did have its share of wonders (the whole South African experience was surreal) nowhere I have witnessed stuff from the above other than in Africa. The whole thing seemed just surreal. I could go on but as I don t know what is real and what is mythology when it comes to various cultures including whatever little I experienced in Africa.

Recent History Having said the above, I also find that many Indians somehow do not either know or are not interested in understanding recent history. I am talking of the time between 1800s and 1950 s. British apologist Mr. William Dalrymple in his book Anarchy has shared how the East Indian Company looted India and gave less than half back to the British. So where did the rest of the money go ? It basically went to the tax free havens around UK. That tale is very much similar as to how the Axis gold was used to have an entity called Switzerland from scratch. There have been books as well as couple of miniseries or two that document that fact. But for me this is not the real story, this is more of a side dish per-se. The real story perhaps is how the EU peace project was born. Because of Hitler s rise and the way World War 2 happened, the Allies knew that they couldn t subjugate Germany through another humiliation as they had after World War 1, who knows another Hitler might come. So while they did fine the Germans, they also helped them via the Marshall plan. Germany also realized that it had been warring with other European countries for over a thousand years as well as quite a few other countries. The first sort of treaty immediately in that regard was the Western Union Alliance . While on the face of it, it was a defense treaty between sovereign powers. Interestingly, as can be seen UK at that point in time wanted more countries to be part of the Treaty of Dunkirk. This was quickly followed 2 years later by the Treaty of London which made way for the Council of Europe to be formed. The reason I am sharing is because a lot of Indians whom I meet on SM do not know that UK was instrumental front and center for the formation of European Union (EU). Also they perhaps don t realize that after World War 2, UK was greatly diminished both financially and militarily. That is the reason it gave back its erstwhile colonies including India and other countries. If UK hadn t become a part of EU they would have been called the poor man of Europe as they had been called few hundred years ago. In fact, after Brexit, UK has been the only nation that has fallen on the dire times as much as it has. They have practically no food, supermarkets running out of veggies and whatnot. electricity sky-high as they don t want to curtail profits of the gas companies which in turn donate money to Tory coffers. Sadly, most of my brethen do not know that hence I have had to share it. The EU became a power in its own right due to Gravity model of trade. The U.S. is attempting the same thing and calls it near off-shoring.

Tesla, Investor Day Presentation, Toyota and Free Software The above brings it nicely about the Tesla Investor Day that happened couple of weeks back. The biggest news though was broken just 24 hours earlier when the governor of Monterrey] shared how Giga New Mexico would be happening and shared some site photographs. There have been bits of news on that off-and-on since then. Toyota meanwhile has been putting lot of anti-EV posters and whatnot. In fact, India seems to be mirroring Japan in a lot of ways, the only difference is Japan is still superior economically than India. I do sincerely hope that at least Japan can get out of its lost decades. I have asked privately if any of the Japanese translators would be willing to translate the anti-EV poster so we may all know.
Anti-EV poster by Toyota

Urinal outside UK Embassy About a week back few people chanting Khalistan entered the Indian Embassy and showed the flags. This was in the UK. Now in a retaliatory move against a friendly nation, we want to make a Urinal. after making a security downgrade for the Embassy. The pettiness being shown by Indian Govt. especially when it has very few friends. There are also plans to do the same for the U.S. and other embassies as well. I do not know when we will get common sense.

Holi Just a few weeks back, Holi happened. It used to be a sweet and innocent festival. But from the last few years, I have been hearing and seeing sexual harassment on the rise. In fact, saw quite a few lewd posts written to women on Holi or verge of Holi and also videos of the same. One of the most shameful incidents occurred with a Japanese tourist. She was not only sexually molested but also terrorized that if she were to report then she could be raped and murdered. She promptly left Delhi. Once it was reported in mass media, Delhi Police tried to show it was doing something. FWIW, Delhi s crime against women stats have been at record high and conviction at record low.

Ecommerce Rules being changed, logistics gonna be tough. Just today there have been changes in Ecommerce rules (again) and this is gonna be a pain for almost all companies big and small with the exception of Adani and Ambani. Almost all players including the Tatas have called them out. Of course, all such laws have been passed without debate. In such a scenario, small startups like these cannot hope to grow their business.

Inertial Measurement Unit or Full Body Tracking with cheap hardware. Apparently, a whole host of companies are looking at 3-D tracking using cheap hardware apparently known as IMU sensors. Sooner than later cheap 3-D glasses and IMU sensors should explode the 3-D market worldwide. There is a huge potential and upside to it and will probably overtake smartphones as well. But then the danger will be of our thoughts, ideas, nightmares etc. to be shared without our consent. The more we tap into a virtual world, what stops anybody from tapping into our brain and practically stealing our identity in more than one way. I dunno if there are any Debian people or FSF projects working on the above. Even Laws can only do so much, until and unless there are alternative places and ways it would be difficult to say the least

13 March 2023

Antoine Beaupr : Framework 12th gen laptop review

The Framework is a 13.5" laptop body with swappable parts, which makes it somewhat future-proof and certainly easily repairable, scoring an "exceedingly rare" 10/10 score from ifixit.com. There are two generations of the laptop's main board (both compatible with the same body): the Intel 11th and 12th gen chipsets. I have received my Framework, 12th generation "DIY", device in late September 2022 and will update this page as I go along in the process of ordering, burning-in, setting up and using the device over the years. Overall, the Framework is a good laptop. I like the keyboard, the touch pad, the expansion cards. Clearly there's been some good work done on industrial design, and it's the most repairable laptop I've had in years. Time will tell, but it looks sturdy enough to survive me many years as well. This is also one of the most powerful devices I ever lay my hands on. I have managed, remotely, more powerful servers, but this is the fastest computer I have ever owned, and it fits in this tiny case. It is an amazing machine. On the downside, there's a bit of proprietary firmware required (WiFi, Bluetooth, some graphics) and the Framework ships with a proprietary BIOS, with currently no Coreboot support. Expect to need the latest kernel, firmware, and hacking around a bunch of things to get resolution and keybindings working right. Like others, I have first found significant power management issues, but many issues can actually be solved with some configuration. Some of the expansion ports (HDMI, DP, MicroSD, and SSD) use power when idle, so don't expect week-long suspend, or "full day" battery while those are plugged in. Finally, the expansion ports are nice, but there's only four of them. If you plan to have a two-monitor setup, you're likely going to need a dock. Read on for the detailed review. For context, I'm moving from the Purism Librem 13v4 because it basically exploded on me. I had, in the meantime, reverted back to an old ThinkPad X220, so I sometimes compare the Framework with that venerable laptop as well. This blog post has been maturing for months now. It started in September 2022 and I declared it completed in March 2023. It's the longest single article on this entire website, currently clocking at about 13,000 words. It will take an average reader a full hour to go through this thing, so I don't expect anyone to actually do that. This introduction should be good enough for most people, read the first section if you intend to actually buy a Framework. Jump around the table of contents as you see fit for after you did buy the laptop, as it might include some crucial hints on how to make it work best for you, especially on (Debian) Linux.

Advice for buyers Those are things I wish I would have known before buying:
  1. consider buying 4 USB-C expansion cards, or at least a mix of 4 USB-A or USB-C cards, as they use less power than other cards and you do want to fill those expansion slots otherwise they snag around and feel insecure
  2. you will likely need a dock or at least a USB hub if you want a two-monitor setup, otherwise you'll run out of ports
  3. you have to do some serious tuning to get proper (10h+ idle, 10 days suspend) power savings
  4. in particular, beware that the HDMI, DisplayPort and particularly the SSD and MicroSD cards take a significant amount power, even when sleeping, up to 2-6W for the latter two
  5. beware that the MicroSD card is what it says: Micro, normal SD cards won't fit, and while there might be full sized one eventually, it's currently only at the prototyping stage
  6. the Framework monitor has an unusual aspect ratio (3:2): I like it (and it matches classic and digital photography aspect ratio), but it might surprise you

Current status I have the framework! It's setup with a fresh new Debian bookworm installation. I've ran through a large number of tests and burn in. I have decided to use the Framework as my daily driver, and had to buy a USB-C dock to get my two monitors connected, which was own adventure. Update: Framework just (2023-03-23) just announced a whole bunch of new stuff: The recording is available in this video and it's not your typical keynote. It starts ~25 minutes late, audio is crap, lightning and camera are crap, clapping seems to be from whatever staff they managed to get together in a room, decor is bizarre, colors are shit. It's amazing.

Specifications Those are the specifications of the 12th gen, in general terms. Your build will of course vary according to your needs.
  • CPU: i5-1240P, i7-1260P, or i7-1280P (Up to 4.4-4.8 GHz, 4+8 cores), Iris Xe graphics
  • Storage: 250-4000GB NVMe (or bring your own)
  • Memory: 8-64GB DDR4-3200 (or bring your own)
  • WiFi 6e (AX210, vPro optional, or bring your own)
  • 296.63mm X 228.98mm X 15.85mm, 1.3Kg
  • 13.5" display, 3:2 ratio, 2256px X 1504px, 100% sRGB, >400 nit
  • 4 x USB-C user-selectable expansion ports, including
    • USB-C
    • USB-A
    • HDMI
    • DP
    • Ethernet
    • MicroSD
    • 250-1000GB SSD
  • 3.5mm combo headphone jack
  • Kill switches for microphone and camera
  • Battery: 55Wh
  • Camera: 1080p 60fps
  • Biometrics: Fingerprint Reader
  • Backlit keyboard
  • Power Adapter: 60W USB-C (or bring your own)
  • ships with a screwdriver/spludger
  • 1 year warranty
  • base price: 1000$CAD, but doesn't give you much, typical builds around 1500-2000$CAD

Actual build This is the actual build I ordered. Amounts in CAD. (1CAD = ~0.75EUR/USD.)

Base configuration
  • CPU: Intel Core i5-1240P (AKA Alder Lake P 8 4.4GHz P-threads, 8 3.2GHz E-threads, 16 total, 28-64W), 1079$
  • Memory: 16GB (1 x 16GB) DDR4-3200, 104$

Customization
  • Keyboard: US English, included

Expansion Cards
  • 2 USB-C $24
  • 3 USB-A $36
  • 2 HDMI $50
  • 1 DP $50
  • 1 MicroSD $25
  • 1 Storage 1TB $199
  • Sub-total: 384$

Accessories
  • Power Adapter - US/Canada $64.00

Total
  • Before tax: 1606$
  • After tax and duties: 1847$
  • Free shipping

Quick evaluation This is basically the TL;DR: here, just focusing on broad pros/cons of the laptop.

Pros

Cons
  • the 11th gen is out of stock, except for the higher-end CPUs, which are much less affordable (700$+)
  • the 12th gen has compatibility issues with Debian, followup in the DebianOn page, but basically: brightness hotkeys, power management, wifi, the webcam is okay even though the chipset is the infamous alder lake because it does not have the fancy camera; most issues currently seem solvable, and upstream is working with mainline to get their shit working
  • 12th gen might have issues with thunderbolt docks
  • they used to have some difficulty keeping up with the orders: first two batches shipped, third batch sold out, fourth batch should have shipped (?) in October 2021. they generally seem to keep up with shipping. update (august 2022): they rolled out a second line of laptops (12th gen), first batch shipped, second batch shipped late, September 2022 batch was generally on time, see this spreadsheet for a crowdsourced effort to track those supply chain issues seem to be under control as of early 2023. I got the Ethernet expansion card shipped within a week.
  • compared to my previous laptop (Purism Librem 13v4), it feels strangely bulkier and heavier; it's actually lighter than the purism (1.3kg vs 1.4kg) and thinner (15.85mm vs 18mm) but the design of the Purism laptop (tapered edges) makes it feel thinner
  • no space for a 2.5" drive
  • rather bright LED around power button, but can be dimmed in the BIOS (not low enough to my taste) I got used to it
  • fan quiet when idle, but can be noisy when running, for example if you max a CPU for a while
  • battery described as "mediocre" by Ars Technica (above), confirmed poor in my tests (see below)
  • no RJ-45 port, and attempts at designing ones are failing because the modular plugs are too thin to fit (according to Linux After Dark), so unlikely to have one in the future Update: they cracked that nut and ship an 2.5 gbps Ethernet expansion card with a realtek chipset, without any firmware blob (!)
  • a bit pricey for the performance, especially when compared to the competition (e.g. Dell XPS, Apple M1)
  • 12th gen Intel has glitchy graphics, seems like Intel hasn't fully landed proper Linux support for that chipset yet

Initial hardware setup A breeze.

Accessing the board The internals are accessed through five TorX screws, but there's a nice screwdriver/spudger that works well enough. The screws actually hold in place so you can't even lose them. The first setup is a bit counter-intuitive coming from the Librem laptop, as I expected the back cover to lift and give me access to the internals. But instead the screws is release the keyboard and touch pad assembly, so you actually need to flip the laptop back upright and lift the assembly off (!) to get access to the internals. Kind of scary. I also actually unplugged a connector in lifting the assembly because I lifted it towards the monitor, while you actually need to lift it to the right. Thankfully, the connector didn't break, it just snapped off and I could plug it back in, no harm done. Once there, everything is well indicated, with QR codes all over the place supposedly leading to online instructions.

Bad QR codes Unfortunately, the QR codes I tested (in the expansion card slot, the memory slot and CPU slots) did not actually work so I wonder how useful those actually are. After all, they need to point to something and that means a URL, a running website that will answer those requests forever. I bet those will break sooner than later and in fact, as far as I can tell, they just don't work at all. I prefer the approach taken by the MNT reform here which designed (with the 100 rabbits folks) an actual paper handbook (PDF). The first QR code that's immediately visible from the back of the laptop, in an expansion cord slot, is a 404. It seems to be some serial number URL, but I can't actually tell because, well, the page is a 404. I was expecting that bar code to lead me to an introduction page, something like "how to setup your Framework laptop". Support actually confirmed that it should point a quickstart guide. But in a bizarre twist, they somehow sent me the URL with the plus (+) signs escaped, like this:
https://guides.frame.work/Guide/Framework\+Laptop\+DIY\+Edition\+Quick\+Start\+Guide/57
... which Firefox immediately transforms in:
https://guides.frame.work/Guide/Framework/+Laptop/+DIY/+Edition/+Quick/+Start/+Guide/57
I'm puzzled as to why they would send the URL that way, the proper URL is of course:
https://guides.frame.work/Guide/Framework+Laptop+DIY+Edition+Quick+Start+Guide/57
(They have also "let the team know about this for feedback and help resolve the problem with the link" which is a support code word for "ha-ha! nope! not my problem right now!" Trust me, I know, my own code word is "can you please make a ticket?")

Seating disks and memory The "DIY" kit doesn't actually have that much of a setup. If you bought RAM, it's shipped outside the laptop in a little plastic case, so you just seat it in as usual. Then you insert your NVMe drive, and, if that's your fancy, you also install your own mPCI WiFi card. If you ordered one (which was my case), it's pre-installed. Closing the laptop is also kind of amazing, because the keyboard assembly snaps into place with magnets. I have actually used the laptop with the keyboard unscrewed as I was putting the drives in and out, and it actually works fine (and will probably void your warranty, so don't do that). (But you can.) (But don't, really.)

Hardware review

Keyboard and touch pad The keyboard feels nice, for a laptop. I'm used to mechanical keyboard and I'm rather violent with those poor things. Yet the key travel is nice and it's clickety enough that I don't feel too disoriented. At first, I felt the keyboard as being more laggy than my normal workstation setup, but it turned out this was a graphics driver issues. After enabling a composition manager, everything feels snappy. The touch pad feels good. The double-finger scroll works well enough, and I don't have to wonder too much where the middle button is, it just works. Taps don't work, out of the box: that needs to be enabled in Xorg, with something like this:
cat > /etc/X11/xorg.conf.d/40-libinput.conf <<EOF
Section "InputClass"
      Identifier "libinput touch pad catchall"
      MatchIsTouchpad "on"
      MatchDevicePath "/dev/input/event*"
      Driver "libinput"
      Option "Tapping" "on"
      Option "TappingButtonMap" "lmr"
EndSection
EOF
But be aware that once you enable that tapping, you'll need to deal with palm detection... So I have not actually enabled this in the end.

Power button The power button is a little dangerous. It's quite easy to hit, as it's right next to one expansion card where you are likely to plug in a cable power. And because the expansion cards are kind of hard to remove, you might squeeze the laptop (and the power key) when trying to remove the expansion card next to the power button. So obviously, don't do that. But that's not very helpful. An alternative is to make the power button do something else. With systemd-managed systems, it's actually quite easy. Add a HandlePowerKey stanza to (say) /etc/systemd/logind.conf.d/power-suspends.conf:
[Login]
HandlePowerKey=suspend
HandlePowerKeyLongPress=poweroff
You might have to create the directory first:
mkdir /etc/systemd/logind.conf.d/
Then restart logind:
systemctl restart systemd-logind
And the power button will suspend! Long-press to power off doesn't actually work as the laptop immediately suspends... Note that there's probably half a dozen other ways of doing this, see this, this, or that.

Special keybindings There is a series of "hidden" (as in: not labeled on the key) keybindings related to the fn keybinding that I actually find quite useful.
Key Equivalent Effect Command
p Pause lock screen xset s activate
b Break ? ?
k ScrLk switch keyboard layout N/A
It looks like those are defined in the microcontroller so it would be possible to add some. For example, the SysRq key is almost bound to fn s in there. Note that most other shortcuts like this are clearly documented (volume, brightness, etc). One key that's less obvious is F12 that only has the Framework logo on it. That actually calls the keysym XF86AudioMedia which, interestingly, does absolutely nothing here. By default, on Windows, it opens your browser to the Framework website and, on Linux, your "default media player". The keyboard backlight can be cycled with fn-space. The dimmer version is dim enough, and the keybinding is easy to find in the dark. A skinny elephant would be performed with alt PrtScr (above F11) KEY, so for example alt fn F11 b should do a hard reset. This comment suggests you need to hold the fn only if "function lock" is on, but that's actually the opposite of my experience. Out of the box, some of the fn keys don't work. Mute, volume up/down, brightness, monitor changes, and the airplane mode key all do basically nothing. They don't send proper keysyms to Xorg at all. This is a known problem and it's related to the fact that the laptop has light sensors to adjust the brightness automatically. Somehow some of those keys (e.g. the brightness controls) are supposed to show up as a different input device, but don't seem to work correctly. It seems like the solution is for the Framework team to write a driver specifically for this, but so far no progress since July 2022. In the meantime, the fancy functionality can be supposedly disabled with:
echo 'blacklist hid_sensor_hub'   sudo tee /etc/modprobe.d/framework-als-blacklist.conf
... and a reboot. This solution is also documented in the upstream guide. Note that there's another solution flying around that fixes this by changing permissions on the input device but I haven't tested that or seen confirmation it works.

Kill switches The Framework has two "kill switches": one for the camera and the other for the microphone. The camera one actually disconnects the USB device when turned off, and the mic one seems to cut the circuit. It doesn't show up as muted, it just stops feeding the sound. Both kill switches are around the main camera, on top of the monitor, and quite discreet. Then turn "red" when enabled (i.e. "red" means "turned off").

Monitor The monitor looks pretty good to my untrained eyes. I have yet to do photography work on it, but some photos I looked at look sharp and the colors are bright and lively. The blacks are dark and the screen is bright. I have yet to use it in full sunlight. The dimmed light is very dim, which I like.

Screen backlight I bind brightness keys to xbacklight in i3, but out of the box I get this error:
sep 29 22:09:14 angela i3[5661]: No outputs have backlight property
It just requires this blob in /etc/X11/xorg.conf.d/backlight.conf:
Section "Device"
    Identifier  "Card0"
    Driver      "intel"
    Option      "Backlight"  "intel_backlight"
EndSection
This way I can control the actual backlight power with the brightness keys, and they do significantly reduce power usage.

Multiple monitor support I have been able to hook up my two old monitors to the HDMI and DisplayPort expansion cards on the laptop. The lid closes without suspending the machine, and everything works great. I actually run out of ports, even with a 4-port USB-A hub, which gives me a total of 7 ports:
  1. power (USB-C)
  2. monitor 1 (DisplayPort)
  3. monitor 2 (HDMI)
  4. USB-A hub, which adds:
  5. keyboard (USB-A)
  6. mouse (USB-A)
  7. Yubikey
  8. external sound card
Now the latter, I might be able to get rid of if I switch to a combo-jack headset, which I do have (and still need to test). But still, this is a problem. I'll probably need a powered USB-C dock and better monitors, possibly with some Thunderbolt chaining, to save yet more ports. But that means more money into this setup, argh. And figuring out my monitor situation is the kind of thing I'm not that big of a fan of. And neither is shopping for USB-C (or is it Thunderbolt?) hubs. My normal autorandr setup doesn't work: I have tried saving a profile and it doesn't get autodetected, so I also first need to do:
autorandr -l framework-external-dual-lg-acer
The magic:
autorandr -l horizontal
... also works well. The worst problem with those monitors right now is that they have a radically smaller resolution than the main screen on the laptop, which means I need to reset the font scaling to normal every time I switch back and forth between those monitors and the laptop, which means I actually need to do this:
autorandr -l horizontal &&
eho Xft.dpi: 96   xrdb -merge &&
systemctl restart terminal xcolortaillog background-image emacs &&
i3-msg restart
Kind of disruptive.

Expansion ports I ordered a total of 10 expansion ports. I did manage to initialize the 1TB drive as an encrypted storage, mostly to keep photos as this is something that takes a massive amount of space (500GB and counting) and that I (unfortunately) don't work on very often (but still carry around). The expansion ports are fancy and nice, but not actually that convenient. They're a bit hard to take out: you really need to crimp your fingernails on there and pull hard to take them out. There's a little button next to them to release, I think, but at first it feels a little scary to pull those pucks out of there. You get used to it though, and it's one of those things you can do without looking eventually. There's only four expansion ports. Once you have two monitors, the drive, and power plugged in, bam, you're out of ports; there's nowhere to plug my Yubikey. So if this is going to be my daily driver, with a dual monitor setup, I will need a dock, which means more crap firmware and uncertainty, which isn't great. There are actually plans to make a dual-USB card, but that is blocked on designing an actual board for this. I can't wait to see more expansion ports produced. There's a ethernet expansion card which quickly went out of stock basically the day it was announced, but was eventually restocked. I would like to see a proper SD-card reader. There's a MicroSD card reader, but that obviously doesn't work for normal SD cards, which would be more broadly compatible anyways (because you can have a MicroSD to SD card adapter, but I have never heard of the reverse). Someone actually found a SD card reader that fits and then someone else managed to cram it in a 3D printed case, which is kind of amazing. Still, I really like that idea that I can carry all those little adapters in a pouch when I travel and can basically do anything I want. It does mean I need to shuffle through them to find the right one which is a little annoying. I have an elastic band to keep them lined up so that all the ports show the same side, to make it easier to find the right one. But that quickly gets undone and instead I have a pouch full of expansion cards. Another awesome thing with the expansion cards is that they don't just work on the laptop: anything that takes USB-C can take those cards, which means you can use it to connect an SD card to your phone, for backups, for example. Heck, you could even connect an external display to your phone that way, assuming that's supported by your phone of course (and it probably isn't). The expansion ports do take up some power, even when idle. See the power management section below, and particularly the power usage tests for details.

USB-C charging One thing that is really a game changer for me is USB-C charging. It's hard to overstate how convenient this is. I often have a USB-C cable lying around to charge my phone, and I can just grab that thing and pop it in my laptop. And while it will obviously not charge as fast as the provided charger, it will stop draining the battery at least. (As I wrote this, I had the laptop plugged in the Samsung charger that came with a phone, and it was telling me it would take 6 hours to charge the remaining 15%. With the provided charger, that flew down to 15 minutes. Similarly, I can power the laptop from the power grommet on my desk, reducing clutter as I have that single wire out there instead of the bulky power adapter.) I also really like the idea that I can charge my laptop with a power bank or, heck, with my phone, if push comes to shove. (And vice-versa!) This is awesome. And it works from any of the expansion ports, of course. There's a little led next to the expansion ports as well, which indicate the charge status:
  • red/amber: charging
  • white: charged
  • off: unplugged
I couldn't find documentation about this, but the forum answered. This is something of a recurring theme with the Framework. While it has a good knowledge base and repair/setup guides (and the forum is awesome) but it doesn't have a good "owner manual" that shows you the different parts of the laptop and what they do. Again, something the MNT reform did well. Another thing that people are asking about is an external sleep indicator: because the power LED is on the main keyboard assembly, you don't actually see whether the device is active or not when the lid is closed. Finally, I wondered what happens when you plug in multiple power sources and it turns out the charge controller is actually pretty smart: it will pick the best power source and use it. The only downside is it can't use multiple power sources, but that seems like a bit much to ask.

Multimedia and other devices Those things also work:
  • webcam: splendid, best webcam I've ever had (but my standards are really low)
  • onboard mic: works well, good gain (maybe a bit much)
  • onboard speakers: sound okay, a little metal-ish, loud enough to be annoying, see this thread for benchmarks, apparently pretty good speakers
  • combo jack: works, with slight hiss, see below
There's also a light sensor, but it conflicts with the keyboard brightness controls (see above). There's also an accelerometer, but it's off by default and will be removed from future builds.

Combo jack mic tests The Framework laptop ships with a combo jack on the left side, which allows you to plug in a CTIA (source) headset. In human terms, it's a device that has both a stereo output and a mono input, typically a headset or ear buds with a microphone somewhere. It works, which is better than the Purism (which only had audio out), but is on par for the course for that kind of onboard hardware. Because of electrical interference, such sound cards very often get lots of noise from the board. With a Jabra Evolve 40, the built-in USB sound card generates basically zero noise on silence (invisible down to -60dB in Audacity) while plugging it in directly generates a solid -30dB hiss. There is a noise-reduction system in that sound card, but the difference is still quite striking. On a comparable setup (curie, a 2017 Intel NUC), there is also a his with the Jabra headset, but it's quieter, more in the order of -40/-50 dB, a noticeable difference. Interestingly, testing with my Mee Audio Pro M6 earbuds leads to a little more hiss on curie, more on the -35/-40 dB range, close to the Framework. Also note that another sound card, the Antlion USB adapter that comes with the ModMic 4, also gives me pretty close to silence on a quiet recording, picking up less than -50dB of background noise. It's actually probably picking up the fans in the office, which do make audible noises. In other words, the hiss of the sound card built in the Framework laptop is so loud that it makes more noise than the quiet fans in the office. Or, another way to put it is that two USB sound cards (the Jabra and the Antlion) are able to pick up ambient noise in my office but not the Framework laptop. See also my audio page.

Performance tests

Compiling Linux 5.19.11 On a single core, compiling the Debian version of the Linux kernel takes around 100 minutes:
5411.85user 673.33system 1:37:46elapsed 103%CPU (0avgtext+0avgdata 831700maxresident)k
10594704inputs+87448000outputs (9131major+410636783minor)pagefaults 0swaps
This was using 16 watts of power, with full screen brightness. With all 16 cores (make -j16), it takes less than 25 minutes:
19251.06user 2467.47system 24:13.07elapsed 1494%CPU (0avgtext+0avgdata 831676maxresident)k
8321856inputs+87427848outputs (30792major+409145263minor)pagefaults 0swaps
I had to plug the normal power supply after a few minutes because battery would actually run out using my desk's power grommet (34 watts). During compilation, fans were spinning really hard, quite noisy, but not painfully so. The laptop was sucking 55 watts of power, steadily:
  Time    User  Nice   Sys  Idle    IO  Run Ctxt/s  IRQ/s Fork Exec Exit  Watts
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Average  87.9   0.0  10.7   1.4   0.1 17.8 6583.6 5054.3 233.0 223.9 233.1  55.96
 GeoMean  87.9   0.0  10.6   1.2   0.0 17.6 6427.8 5048.1 227.6 218.7 227.7  55.96
  StdDev   1.4   0.0   1.2   0.6   0.2  3.0 1436.8  255.5 50.0 47.5 49.7   0.20
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Minimum  85.0   0.0   7.8   0.5   0.0 13.0 3594.0 4638.0 117.0 111.0 120.0  55.52
 Maximum  90.8   0.0  12.9   3.5   0.8 38.0 10174.0 5901.0 374.0 362.0 375.0  56.41
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
Summary:
CPU:  55.96 Watts on average with standard deviation 0.20
Note: power read from RAPL domains: package-0, uncore, package-0, core, psys.
These readings do not cover all the hardware in this device.

memtest86+ I ran Memtest86+ v6.00b3. It shows something like this:
Memtest86+ v6.00b3        12th Gen Intel(R) Core(TM) i5-1240P
CLK/Temp: 2112MHz    78/78 C   Pass  2% #
L1 Cache:   48KB    414 GB/s   Test 46% ##################
L2 Cache: 1.25MB    118 GB/s   Test #3 [Moving inversions, 1s & 0s] 
L3 Cache:   12MB     43 GB/s   Testing: 16GB - 18GB [1GB of 15.7GB]
Memory  :  15.7GB  14.9 GB/s   Pattern: 
--------------------------------------------------------------------------------
CPU: 4P+8E-Cores (16T)    SMP: 8T (PAR))    Time:  0:27:23  Status: Pass     \
RAM: 1600MHz (DDR4-3200) CAS 22-22-22-51    Pass:  1        Errors: 0
--------------------------------------------------------------------------------
Memory SPD Information
----------------------
 - Slot 2: 16GB DDR-4-3200 - Crucial CT16G4SFRA32A.C16FP (2022-W23)
                          Framework FRANMACP04
 <ESC> Exit  <F1> Configuration  <Space> Scroll Lock            6.00.unknown.x64
So about 30 minutes for a full 16GB memory test.

Software setup Once I had everything in the hardware setup, I figured, voil , I'm done, I'm just going to boot this beautiful machine and I can get back to work. I don't understand why I am so na ve some times. It's mind boggling. Obviously, it didn't happen that way at all, and I spent the best of the three following days tinkering with the laptop.

Secure boot and EFI First, I couldn't boot off of the NVMe drive I transferred from the previous laptop (the Purism) and the BIOS was not very helpful: it was just complaining about not finding any boot device, without dropping me in the real BIOS. At first, I thought it was a problem with my NVMe drive, because it's not listed in the compatible SSD drives from upstream. But I figured out how to enter BIOS (press F2 manically, of course), which showed the NVMe drive was actually detected. It just didn't boot, because it was an old (2010!!) Debian install without EFI. So from there, I disabled secure boot, and booted a grml image to try to recover. And by "boot" I mean, I managed to get to the grml boot loader which promptly failed to load its own root file system somehow. I still have to investigate exactly what happened there, but it failed some time after the initrd load with:
Unable to find medium containing a live file system
This, it turns out, was fixed in Debian lately, so a daily GRML build will not have this problems. The upcoming 2022 release (likely 2022.10 or 2022.11) will also get the fix. I did manage to boot the development version of the Debian installer which was a surprisingly good experience: it mounted the encrypted drives and did everything pretty smoothly. It even offered me to reinstall the boot loader, but that ultimately (and correctly, as it turns out) failed because I didn't have a /boot/efi partition. At this point, I realized there was no easy way out of this, and I just proceeded to completely reinstall Debian. I had a spare NVMe drive lying around (backups FTW!) so I just swapped that in, rebooted in the Debian installer, and did a clean install. I wanted to switch to bookworm anyways, so I guess that's done too.

Storage limitations Another thing that happened during setup is that I tried to copy over the internal 2.5" SSD drive from the Purism to the Framework 1TB expansion card. There's no 2.5" slot in the new laptop, so that's pretty much the only option for storage expansion. I was tired and did something wrong. I ended up wiping the partition table on the original 2.5" drive. Oops. It might be recoverable, but just restoring the partition table didn't work either, so I'm not sure how I recover the data there. Normally, everything on my laptops and workstations is designed to be disposable, so that wasn't that big of a problem. I did manage to recover most of the data thanks to git-annex reinit, but that was a little hairy.

Bootstrapping Puppet Once I had some networking, I had to install all the packages I needed. The time I spent setting up my workstations with Puppet has finally paid off. What I actually did was to restore two critical directories:
/etc/ssh
/var/lib/puppet
So that I would keep the previous machine's identity. That way I could contact the Puppet server and install whatever was missing. I used my Puppet optimization trick to do a batch install and then I had a good base setup, although not exactly as it was before. 1700 packages were installed manually on angela before the reinstall, and not in Puppet. I did not inspect each one individually, but I did go through /etc and copied over more SSH keys, for backups and SMTP over SSH.

LVFS support It looks like there's support for the (de-facto) standard LVFS firmware update system. At least I was able to update the UEFI firmware with a simple:
apt install fwupd-amd64-signed
fwupdmgr refresh
fwupdmgr get-updates
fwupdmgr update
Nice. The 12th gen BIOS updates, currently (January 2023) beta, can be deployed through LVFS with:
fwupdmgr enable-remote lvfs-testing
echo 'DisableCapsuleUpdateOnDisk=true' >> /etc/fwupd/uefi_capsule.conf 
fwupdmgr update
Those instructions come from the beta forum post. I performed the BIOS update on 2023-01-16T16:00-0500.

Resolution tweaks The Framework laptop resolution (2256px X 1504px) is big enough to give you a pretty small font size, so welcome to the marvelous world of "scaling". The Debian wiki page has a few tricks for this.

Console This will make the console and grub fonts more readable:
cat >> /etc/default/console-setup <<EOF
FONTFACE="Terminus"
FONTSIZE=32x16
EOF
echo GRUB_GFXMODE=1024x768 >> /etc/default/grub
update-grub

Xorg Adding this to your .Xresources will make everything look much bigger:
! 1.5*96
Xft.dpi: 144
Apparently, some of this can also help:
! These might also be useful depending on your monitor and personal preference:
Xft.autohint: 0
Xft.lcdfilter:  lcddefault
Xft.hintstyle:  hintfull
Xft.hinting: 1
Xft.antialias: 1
Xft.rgba: rgb
It my experience it also makes things look a little fuzzier, which is frustrating because you have this awesome monitor but everything looks out of focus. Just bumping Xft.dpi by a 1.5 factor looks good to me. The Debian Wiki has a page on HiDPI, but it's not as good as the Arch Wiki, where the above blurb comes from. I am not using the latter because I suspect it's causing some of the "fuzziness". TODO: find the equivalent of this GNOME hack in i3? (gsettings set org.gnome.mutter experimental-features "['scale-monitor-framebuffer']"), taken from this Framework guide

Issues

BIOS configuration The Framework BIOS has some minor issues. One issue I personally encountered is that I had disabled Quick boot and Quiet boot in the BIOS to diagnose the above boot issues. This, in turn, triggers a bug where the BIOS boot manager (F12) would just hang completely. It would also fail to boot from an external USB drive. The current fix (as of BIOS 3.03) is to re-enable both Quick boot and Quiet boot. Presumably this is something that will get fixed in a future BIOS update. Note that the following keybindings are active in the BIOS POST check:
Key Meaning
F2 Enter BIOS setup menu
F12 Enter BIOS boot manager
Delete Enter BIOS setup menu

WiFi compatibility issues I couldn't make WiFi work at first. Obviously, the default Debian installer doesn't ship with proprietary firmware (although that might change soon) so the WiFi card didn't work out of the box. But even after copying the firmware through a USB stick, I couldn't quite manage to find the right combination of ip/iw/wpa-supplicant (yes, after repeatedly copying a bunch more packages over to get those bootstrapped). (Next time I should probably try something like this post.) Thankfully, I had a little USB-C dongle with a RJ-45 jack lying around. That also required a firmware blob, but it was a single package to copy over, and with that loaded, I had network. Eventually, I did managed to make WiFi work; the problem was more on the side of "I forgot how to configure a WPA network by hand from the commandline" than anything else. NetworkManager worked fine and got WiFi working correctly. Note that this is with Debian bookworm, which has the 5.19 Linux kernel, and with the firmware-nonfree (firmware-iwlwifi, specifically) package.

Battery life I was having between about 7 hours of battery on the Purism Librem 13v4, and that's after a year or two of battery life. Now, I still have about 7 hours of battery life, which is nicer than my old ThinkPad X220 (20 minutes!) but really, it's not that good for a new generation laptop. The 12th generation Intel chipset probably improved things compared to the previous one Framework laptop, but I don't have a 11th gen Framework to compare with). (Note that those are estimates from my status bar, not wall clock measurements. They should still be comparable between the Purism and Framework, that said.) The battery life doesn't seem up to, say, Dell XPS 13, ThinkPad X1, and of course not the Apple M1, where I would expect 10+ hours of battery life out of the box. That said, I do get those kind estimates when the machine is fully charged and idle. In fact, when everything is quiet and nothing is plugged in, I get dozens of hours of battery life estimated (I've seen 25h!). So power usage fluctuates quite a bit depending on usage, which I guess is expected. Concretely, so far, light web browsing, reading emails and writing notes in Emacs (e.g. this file) takes about 8W of power:
Time    User  Nice   Sys  Idle    IO  Run Ctxt/s  IRQ/s Fork Exec Exit  Watts
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Average   1.7   0.0   0.5  97.6   0.2  1.2 4684.9 1985.2 126.6 39.1 128.0   7.57
 GeoMean   1.4   0.0   0.4  97.6   0.1  1.2 4416.6 1734.5 111.6 27.9 113.3   7.54
  StdDev   1.0   0.2   0.2   1.2   0.0  0.5 1584.7 1058.3 82.1 44.0 80.2   0.71
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
 Minimum   0.2   0.0   0.2  94.9   0.1  1.0 2242.0  698.2 82.0 17.0 82.0   6.36
 Maximum   4.1   1.1   1.0  99.4   0.2  3.0 8687.4 4445.1 463.0 249.0 449.0   9.10
-------- ----- ----- ----- ----- ----- ---- ------ ------ ---- ---- ---- ------
Summary:
System:   7.57 Watts on average with standard deviation 0.71
Expansion cards matter a lot in the battery life (see below for a thorough discussion), my normal setup is 2xUSB-C and 1xUSB-A (yes, with an empty slot, and yes, to save power). Interestingly, playing a video in a (720p) window in a window takes up more power (10.5W) than in full screen (9.5W) but I blame that on my desktop setup (i3 + compton)... Not sure if mpv hits the VA-API, maybe not in windowed mode. Similar results with 1080p, interestingly, except the window struggles to keep up altogether. Full screen playback takes a relatively comfortable 9.5W, which means a solid 5h+ of playback, which is fine by me. Fooling around the web, small edits, youtube-dl, and I'm at around 80% battery after about an hour, with an estimated 5h left, which is a little disappointing. I had a 7h remaining estimate before I started goofing around Discourse, so I suspect the website is a pretty big battery drain, actually. I see about 10-12 W, while I was probably at half that (6-8W) just playing music with mpv in the background... In other words, it looks like editing posts in Discourse with Firefox takes a solid 4-6W of power. Amazing and gross. (When writing about abusive power usage generates more power usage, is that an heisenbug? Or schr dinbug?)

Power management Compared to the Purism Librem 13v4, the ongoing power usage seems to be slightly better. An anecdotal metric is that the Purism would take 800mA idle, while the more powerful Framework manages a little over 500mA as I'm typing this, fluctuating between 450 and 600mA. That is without any active expansion card, except the storage. Those numbers come from the output of tlp-stat -b and, unfortunately, the "ampere" unit makes it quite hard to compare those, because voltage is not necessarily the same between the two platforms.
  • TODO: review Arch Linux's tips on power saving
  • TODO: i915 driver has a lot of parameters, including some about power saving, see, again, the arch wiki, and particularly enable_fbc=1
TL:DR; power management on the laptop is an issue, but there's various tweaks you can make to improve it. Try:
  • powertop --auto-tune
  • apt install tlp && systemctl enable tlp
  • nvme.noacpi=1 mem_sleep_default=deep on the kernel command line may help with standby power usage
  • keep only USB-C expansion cards plugged in, all others suck power even when idle
  • consider upgrading the BIOS to latest beta (3.06 at the time of writing), unverified power savings
  • latest Linux kernels (6.2) promise power savings as well (unverified)
Update: also try to follow the official optimization guide. It was made for Ubuntu but will probably also work for your distribution of choice with a few tweaks. They recommend using tlpui but it's not packaged in Debian. There is, however, a Flatpak release. In my case, it resulted in the following diff to tlp.conf: tlp.patch.

Background on CPU architecture There were power problems in the 11th gen Framework laptop, according to this report from Linux After Dark, so the issues with power management on the Framework are not new. The 12th generation Intel CPU (AKA "Alder Lake") is a big-little architecture with "power-saving" and "performance" cores. There used to be performance problems introduced by the scheduler in Linux 5.16 but those were eventually fixed in 5.18, which uses Intel's hardware as an "intelligent, low-latency hardware-assisted scheduler". According to Phoronix, the 5.19 release improved the power saving, at the cost of some penalty cost. There were also patch series to make the scheduler configurable, but it doesn't look those have been merged as of 5.19. There was also a session about this at the 2022 Linux Plumbers, but they stopped short of talking more about the specific problems Linux is facing in Alder lake:
Specifically, the kernel's energy-aware scheduling heuristics don't work well on those CPUs. A number of features present there complicate the energy picture; these include SMT, Intel's "turbo boost" mode, and the CPU's internal power-management mechanisms. For many workloads, running on an ostensibly more power-hungry Pcore can be more efficient than using an Ecore. Time for discussion of the problem was lacking, though, and the session came to a close.
All this to say that the 12gen Intel line shipped with this Framework series should have better power management thanks to its power-saving cores. And Linux has had the scheduler changes to make use of this (but maybe is still having trouble). In any case, this might not be the source of power management problems on my laptop, quite the opposite. Also note that the firmware updates for various chipsets are supposed to improve things eventually. On the other hand, The Verge simply declared the whole P-series a mistake...

Attempts at improving power usage I did try to follow some of the tips in this forum post. The tricks powertop --auto-tune and tlp's PCIE_ASPM_ON_BAT=powersupersave basically did nothing: I was stuck at 10W power usage in powertop (600+mA in tlp-stat). Apparently, I should be able to reach the C8 CPU power state (or even C9, C10) in powertop, but I seem to be stock at C7. (Although I'm not sure how to read that tab in powertop: in the Core(HW) column there's only C3/C6/C7 states, and most cores are 85% in C7 or maybe C6. But the next column over does show many CPUs in C10 states... As it turns out, the graphics card actually takes up a good chunk of power unless proper power management is enabled (see below). After tweaking this, I did manage to get down to around 7W power usage in powertop. Expansion cards actually do take up power, and so does the screen, obviously. The fully-lit screen takes a solid 2-3W of power compared to the fully dimmed screen. When removing all expansion cards and making the laptop idle, I can spin it down to 4 watts power usage at the moment, and an amazing 2 watts when the screen turned off.

Caveats Abusive (10W+) power usage that I initially found could be a problem with my desktop configuration: I have this silly status bar that updates every second and probably causes redraws... The CPU certainly doesn't seem to spin down below 1GHz. Also note that this is with an actual desktop running with everything: it could very well be that some things (I'm looking at you Signal Desktop) take up unreasonable amount of power on their own (hello, 1W/electron, sheesh). Syncthing and containerd (Docker!) also seem to take a good 500mW just sitting there. Beyond my desktop configuration, this could, of course, be a Debian-specific problem; your favorite distribution might be better at power management.

Idle power usage tests Some expansion cards waste energy, even when unused. Here is a summary of the findings from the powerstat page. I also include other devices tested in this page for completeness:
Device Minimum Average Max Stdev Note
Screen, 100% 2.4W 2.6W 2.8W N/A
Screen, 1% 30mW 140mW 250mW N/A
Backlight 1 290mW ? ? ? fairly small, all things considered
Backlight 2 890mW 1.2W 3W? 460mW? geometric progression
Backlight 3 1.69W 1.5W 1.8W? 390mW? significant power use
Radios 100mW 250mW N/A N/A
USB-C N/A N/A N/A N/A negligible power drain
USB-A 10mW 10mW ? 10mW almost negligible
DisplayPort 300mW 390mW 600mW N/A not passive
HDMI 380mW 440mW 1W? 20mW not passive
1TB SSD 1.65W 1.79W 2W 12mW significant, probably higher when busy
MicroSD 1.6W 3W 6W 1.93W highest power usage, possibly even higher when busy
Ethernet 1.69W 1.64W 1.76W N/A comparable to the SSD card
So it looks like all expansion cards but the USB-C ones are active, i.e. they draw power with idle. The USB-A cards are the least concern, sucking out 10mW, pretty much within the margin of error. But both the DisplayPort and HDMI do take a few hundred miliwatts. It looks like USB-A connectors have this fundamental flaw that they necessarily draw some powers because they lack the power negotiation features of USB-C. At least according to this post:
It seems the USB A must have power going to it all the time, that the old USB 2 and 3 protocols, the USB C only provides power when there is a connection. Old versus new.
Apparently, this is a problem specific to the USB-C to USB-A adapter that ships with the Framework. Some people have actually changed their orders to all USB-C because of this problem, but I'm not sure the problem is as serious as claimed in the forums. I couldn't reproduce the "one watt" power drains suggested elsewhere, at least not repeatedly. (A previous version of this post did show such a power drain, but it was in a less controlled test environment than the series of more rigorous tests above.) The worst offenders are the storage cards: the SSD drive takes at least one watt of power and the MicroSD card seems to want to take all the way up to 6 watts of power, both just sitting there doing nothing. This confirms claims of 1.4W for the SSD (but not 5W) power usage found elsewhere. The former post has instructions on how to disable the card in software. The MicroSD card has been reported as using 2 watts, but I've seen it as high as 6 watts, which is pretty damning. The Framework team has a beta update for the DisplayPort adapter but currently only for Windows (LVFS technically possible, "under investigation"). A USB-A firmware update is also under investigation. It is therefore likely at least some of those power management issues will eventually be fixed. Note that the upcoming Ethernet card has a reported 2-8W power usage, depending on traffic. I did my own power usage tests in powerstat-wayland and they seem lower than 2W. The upcoming 6.2 Linux kernel might also improve battery usage when idle, see this Phoronix article for details, likely in early 2023.

Idle power usage tests under Wayland Update: I redid those tests under Wayland, see powerstat-wayland for details. The TL;DR: is that power consumption is either smaller or similar.

Idle power usage tests, 3.06 beta BIOS I redid the idle tests after the 3.06 beta BIOS update and ended up with this results:
Device Minimum Average Max Stdev Note
Baseline 1.96W 2.01W 2.11W 30mW 1 USB-C, screen off, backlight off, no radios
2 USB-C 1.95W 2.16W 3.69W 430mW USB-C confirmed as mostly passive...
3 USB-C 1.95W 2.16W 3.69W 430mW ... although with extra stdev
1TB SSD 3.72W 3.85W 4.62W 200mW unchanged from before upgrade
1 USB-A 1.97W 2.18W 4.02W 530mW unchanged
2 USB-A 1.97W 2.00W 2.08W 30mW unchanged
3 USB-A 1.94W 1.99W 2.03W 20mW unchanged
MicroSD w/o card 3.54W 3.58W 3.71W 40mW significant improvement! 2-3W power saving!
MicroSD w/ card 3.53W 3.72W 5.23W 370mW new measurement! increased deviation
DisplayPort 2.28W 2.31W 2.37W 20mW unchanged
1 HDMI 2.43W 2.69W 4.53W 460mW unchanged
2 HDMI 2.53W 2.59W 2.67W 30mW unchanged
External USB 3.85W 3.89W 3.94W 30mW new result
Ethernet 3.60W 3.70W 4.91W 230mW unchanged
Note that the table summary is different than the previous table: here we show the absolute numbers while the previous table was doing a confusing attempt at showing relative (to the baseline) numbers. Conclusion: the 3.06 BIOS update did not significantly change idle power usage stats except for the MicroSD card which has significantly improved. The new "external USB" test is also interesting: it shows how the provided 1TB SSD card performs (admirably) compared to existing devices. The other new result is the MicroSD card with a card which, interestingly, uses less power than the 1TB SSD drive.

Standby battery usage I wrote some quick hack to evaluate how much power is used during sleep. Apparently, this is one of the areas that should have improved since the first Framework model, let's find out. My baseline for comparison is the Purism laptop, which, in 10 minutes, went from this:
sep 28 11:19:45 angela systemd-sleep[209379]: /sys/class/power_supply/BAT/charge_now                      =   6045 [mAh]
... to this:
sep 28 11:29:47 angela systemd-sleep[209725]: /sys/class/power_supply/BAT/charge_now                      =   6037 [mAh]
That's 8mAh per 10 minutes (and 2 seconds), or 48mA, or, with this battery, about 127 hours or roughly 5 days of standby. Not bad! In comparison, here is my really old x220, before:
sep 29 22:13:54 emma systemd-sleep[176315]: /sys/class/power_supply/BAT0/energy_now                     =   5070 [mWh]
... after:
sep 29 22:23:54 emma systemd-sleep[176486]: /sys/class/power_supply/BAT0/energy_now                     =   4980 [mWh]
... which is 90 mwH in 10 minutes, or a whopping 540mA, which was possibly okay when this battery was new (62000 mAh, so about 100 hours, or about 5 days), but this battery is almost dead and has only 5210 mAh when full, so only 10 hours standby. And here is the Framework performing a similar test, before:
sep 29 22:27:04 angela systemd-sleep[4515]: /sys/class/power_supply/BAT1/charge_full                    =   3518 [mAh]
sep 29 22:27:04 angela systemd-sleep[4515]: /sys/class/power_supply/BAT1/charge_now                     =   2861 [mAh]
... after:
sep 29 22:37:08 angela systemd-sleep[4743]: /sys/class/power_supply/BAT1/charge_now                     =   2812 [mAh]
... which is 49mAh in a little over 10 minutes (and 4 seconds), or 292mA, much more than the Purism, but half of the X220. At this rate, the battery would last on standby only 12 hours!! That is pretty bad. Note that this was done with the following expansion cards:
  • 2 USB-C
  • 1 1TB SSD drive
  • 1 USB-A with a hub connected to it, with keyboard and LAN
Preliminary tests without the hub (over one minute) show that it doesn't significantly affect this power consumption (300mA). This guide also suggests booting with nvme.noacpi=1 but this still gives me about 5mAh/min (or 300mA). Adding mem_sleep_default=deep to the kernel command line does make a difference. Before:
sep 29 23:03:11 angela systemd-sleep[3699]: /sys/class/power_supply/BAT1/charge_now                     =   2544 [mAh]
... after:
sep 29 23:04:25 angela systemd-sleep[4039]: /sys/class/power_supply/BAT1/charge_now                     =   2542 [mAh]
... which is 2mAh in 74 seconds, which is 97mA, brings us to a more reasonable 36 hours, or a day and a half. It's still above the x220 power usage, and more than an order of magnitude more than the Purism laptop. It's also far from the 0.4% promised by upstream, which would be 14mA for the 3500mAh battery. It should also be noted that this "deep" sleep mode is a little more disruptive than regular sleep. As you can see by the timing, it took more than 10 seconds for the laptop to resume, which feels a little alarming as your banging the keyboard to bring it back to life. You can confirm the current sleep mode with:
# cat /sys/power/mem_sleep
s2idle [deep]
In the above, deep is selected. You can change it on the fly with:
printf s2idle > /sys/power/mem_sleep
Here's another test:
sep 30 22:25:50 angela systemd-sleep[32207]: /sys/class/power_supply/BAT1/charge_now                     =   1619 [mAh]
sep 30 22:31:30 angela systemd-sleep[32516]: /sys/class/power_supply/BAT1/charge_now                     =   1613 [mAh]
... better! 6 mAh in about 6 minutes, works out to 63.5mA, so more than two days standby. A longer test:
oct 01 09:22:56 angela systemd-sleep[62978]: /sys/class/power_supply/BAT1/charge_now                     =   3327 [mAh]
oct 01 12:47:35 angela systemd-sleep[63219]: /sys/class/power_supply/BAT1/charge_now                     =   3147 [mAh]
That's 180mAh in about 3.5h, 52mA! Now at 66h, or almost 3 days. I wasn't sure why I was seeing such fluctuations in those tests, but as it turns out, expansion card power tests show that they do significantly affect power usage, especially the SSD drive, which can take up to two full watts of power even when idle. I didn't control for expansion cards in the above tests running them with whatever card I had plugged in without paying attention so it's likely the cause of the high power usage and fluctuations. It might be possible to work around this problem by disabling USB devices before suspend. TODO. See also this post. In the meantime, I have been able to get much better suspend performance by unplugging all modules. Then I get this result:
oct 04 11:15:38 angela systemd-sleep[257571]: /sys/class/power_supply/BAT1/charge_now                     =   3203 [mAh]
oct 04 15:09:32 angela systemd-sleep[257866]: /sys/class/power_supply/BAT1/charge_now                     =   3145 [mAh]
Which is 14.8mA! Almost exactly the number promised by Framework! With a full battery, that means a 10 days suspend time. This is actually pretty good, and far beyond what I was expecting when starting down this journey. So, once the expansion cards are unplugged, suspend power usage is actually quite reasonable. More detailed standby tests are available in the standby-tests page, with a summary below. There is also some hope that the Chromebook edition specifically designed with a specification of 14 days standby time could bring some firmware improvements back down to the normal line. Some of those issues were reported upstream in April 2022, but there doesn't seem to have been any progress there since. TODO: one final solution here is suspend-then-hibernate, which Windows uses for this TODO: consider implementing the S0ix sleep states , see also troubleshooting TODO: consider https://github.com/intel/pm-graph

Standby expansion cards test results This table is a summary of the more extensive standby-tests I have performed:
Device Wattage Amperage Days Note
baseline 0.25W 16mA 9 sleep=deep nvme.noacpi=1
s2idle 0.29W 18.9mA ~7 sleep=s2idle nvme.noacpi=1
normal nvme 0.31W 20mA ~7 sleep=s2idle without nvme.noacpi=1
1 USB-C 0.23W 15mA ~10
2 USB-C 0.23W 14.9mA same as above
1 USB-A 0.75W 48.7mA 3 +500mW (!!) for the first USB-A card!
2 USB-A 1.11W 72mA 2 +360mW
3 USB-A 1.48W 96mA <2 +370mW
1TB SSD 0.49W 32mA <5 +260mW
MicroSD 0.52W 34mA ~4 +290mW
DisplayPort 0.85W 55mA <3 +620mW (!!)
1 HDMI 0.58W 38mA ~4 +250mW
2 HDMI 0.65W 42mA <4 +70mW (?)
Conclusions:
  • USB-C cards take no extra power on suspend, possibly less than empty slots, more testing required
  • USB-A cards take a lot more power on suspend (300-500mW) than on regular idle (~10mW, almost negligible)
  • 1TB SSD and MicroSD cards seem to take a reasonable amount of power (260-290mW), compared to their runtime equivalents (1-6W!)
  • DisplayPort takes a surprising lot of power (620mW), almost double its average runtime usage (390mW)
  • HDMI cards take, surprisingly, less power (250mW) in standby than the DP card (620mW)
  • and oddly, a second card adds less power usage (70mW?!) than the first, maybe a circuit is used by both?
A discussion of those results is in this forum post.

Standby expansion cards test results, 3.06 beta BIOS Framework recently (2022-11-07) announced that they will publish a firmware upgrade to address some of the USB-C issues, including power management. This could positively affect the above result, improving both standby and runtime power usage. The update came out in December 2022 and I redid my analysis with the following results:
Device Wattage Amperage Days Note
baseline 0.25W 16mA 9 no cards, same as before upgrade
1 USB-C 0.25W 16mA 9 same as before
2 USB-C 0.25W 16mA 9 same
1 USB-A 0.80W 62mA 3 +550mW!! worse than before
2 USB-A 1.12W 73mA <2 +320mW, on top of the above, bad!
Ethernet 0.62W 40mA 3-4 new result, decent
1TB SSD 0.52W 34mA 4 a bit worse than before (+2mA)
MicroSD 0.51W 22mA 4 same
DisplayPort 0.52W 34mA 4+ upgrade improved by 300mW
1 HDMI ? 38mA ? same
2 HDMI ? 45mA ? a bit worse than before (+3mA)
Normal 1.08W 70mA ~2 Ethernet, 2 USB-C, USB-A
Full results in standby-tests-306. The big takeaway for me is that the update did not improve power usage on the USB-A ports which is a big problem for my use case. There is a notable improvement on the DisplayPort power consumption which brings it more in line with the HDMI connector, but it still doesn't properly turn off on suspend either. Even worse, the USB-A ports now sometimes fails to resume after suspend, which is pretty annoying. This is a known problem that will hopefully get fixed in the final release.

Battery wear protection The BIOS has an option to limit charge to 80% to mitigate battery wear. There's a way to control the embedded controller from runtime with fw-ectool, partly documented here. The command would be:
sudo ectool fwchargelimit 80
I looked at building this myself but failed to run it. I opened a RFP in Debian so that we can ship this in Debian, and also documented my work there. Note that there is now a counter that tracks charge/discharge cycles. It's visible in tlp-stat -b, which is a nice improvement:
root@angela:/home/anarcat# tlp-stat -b
--- TLP 1.5.0 --------------------------------------------
+++ Battery Care
Plugin: generic
Supported features: none available
+++ Battery Status: BAT1
/sys/class/power_supply/BAT1/manufacturer                   = NVT
/sys/class/power_supply/BAT1/model_name                     = Framewo
/sys/class/power_supply/BAT1/cycle_count                    =      3
/sys/class/power_supply/BAT1/charge_full_design             =   3572 [mAh]
/sys/class/power_supply/BAT1/charge_full                    =   3541 [mAh]
/sys/class/power_supply/BAT1/charge_now                     =   1625 [mAh]
/sys/class/power_supply/BAT1/current_now                    =    178 [mA]
/sys/class/power_supply/BAT1/status                         = Discharging
/sys/class/power_supply/BAT1/charge_control_start_threshold = (not available)
/sys/class/power_supply/BAT1/charge_control_end_threshold   = (not available)
Charge                                                      =   45.9 [%]
Capacity                                                    =   99.1 [%]
One thing that is still missing is the charge threshold data (the (not available) above). There's been some work to make that accessible in August, stay tuned? This would also make it possible implement hysteresis support.

Ethernet expansion card The Framework ethernet expansion card is a fancy little doodle: "2.5Gbit/s and 10/100/1000Mbit/s Ethernet", the "clear housing lets you peek at the RTL8156 controller that powers it". Which is another way to say "we didn't completely finish prod on this one, so it kind of looks like we 3D-printed this in the shop".... The card is a little bulky, but I guess that's inevitable considering the RJ-45 form factor when compared to the thin Framework laptop. I have had a serious issue when trying it at first: the link LEDs just wouldn't come up. I made a full bug report in the forum and with upstream support, but eventually figured it out on my own. It's (of course) a power saving issue: if you reboot the machine, the links come up when the laptop is running the BIOS POST check and even when the Linux kernel boots. I first thought that the problem is likely related to the powertop service which I run at boot time to tweak some power saving settings. It seems like this:
echo 'on' > '/sys/bus/usb/devices/4-2/power/control'
... is a good workaround to bring the card back online. You can even return to power saving mode and the card will still work:
echo 'auto' > '/sys/bus/usb/devices/4-2/power/control'
Further research by Matt_Hartley from the Framework Team found this issue in the tlp tracker that shows how the USB_AUTOSUSPEND setting enables the power saving even if the driver doesn't support it, which, in retrospect, just sounds like a bad idea. To quote that issue:
By default, USB power saving is active in the kernel, but not force-enabled for incompatible drivers. That is, devices that support suspension will suspend, drivers that do not, will not.
So the fix is actually to uninstall tlp or disable that setting by adding this to /etc/tlp.conf:
USB_AUTOSUSPEND=0
... but that disables auto-suspend on all USB devices, which may hurt other power usage performance. I have found that a a combination of:
USB_AUTOSUSPEND=1
USB_DENYLIST="0bda:8156"
and this on the kernel commandline:
usbcore.quirks=0bda:8156:k
... actually does work correctly. I now have this in my /etc/default/grub.d/framework-tweaks.cfg file:
# net.ifnames=0: normal interface names ffs (e.g. eth0, wlan0, not wlp166
s0)
# nvme.noacpi=1: reduce SSD disk power usage (not working)
# mem_sleep_default=deep: reduce power usage during sleep (not working)
# usbcore.quirk is a workaround for the ethernet card suspend bug: https:
//guides.frame.work/Guide/Fedora+37+Installation+on+the+Framework+Laptop/
108?lang=en
GRUB_CMDLINE_LINUX="net.ifnames=0 nvme.noacpi=1 mem_sleep_default=deep usbcore.quirks=0bda:8156:k"
# fix the resolution in grub for fonts to not be tiny
GRUB_GFXMODE=1024x768
Other than that, I haven't been able to max out the card because I don't have other 2.5Gbit/s equipment at home, which is strangely satisfying. But running against my Turris Omnia router, I could pretty much max a gigabit fairly easily:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.09 GBytes   937 Mbits/sec  238             sender
[  5]   0.00-10.00  sec  1.09 GBytes   934 Mbits/sec                  receiver
The card doesn't require any proprietary firmware blobs which is surprising. Other than the power saving issues, it just works. In my power tests (see powerstat-wayland), the Ethernet card seems to use about 1.6W of power idle, without link, in the above "quirky" configuration where the card is functional but without autosuspend.

Proprietary firmware blobs The framework does need proprietary firmware to operate. Specifically:
  • the WiFi network card shipped with the DIY kit is a AX210 card that requires a 5.19 kernel or later, and the firmware-iwlwifi non-free firmware package
  • the Bluetooth adapter also loads the firmware-iwlwifi package (untested)
  • the graphics work out of the box without firmware, but certain power management features come only with special proprietary firmware, normally shipped in the firmware-misc-nonfree but currently missing from the package
Note that, at the time of writing, the latest i915 firmware from linux-firmware has a serious bug where loading all the accessible firmware results in noticeable I estimate 200-500ms lag between the keyboard (not the mouse!) and the display. Symptoms also include tearing and shearing of windows, it's pretty nasty. One workaround is to delete the two affected firmware files:
cd /lib/firmware && rm adlp_guc_70.1.1.bin adlp_guc_69.0.3.bin
update-initramfs -u
You will get the following warning during build, which is good as it means the problematic firmware is disabled:
W: Possible missing firmware /lib/firmware/i915/adlp_guc_69.0.3.bin for module i915
W: Possible missing firmware /lib/firmware/i915/adlp_guc_70.1.1.bin for module i915
But then it also means that critical firmware isn't loaded, which means, among other things, a higher battery drain. I was able to move from 8.5-10W down to the 7W range after making the firmware work properly. This is also after turning the backlight all the way down, as that takes a solid 2-3W in full blast. The proper fix is to use some compositing manager. I ended up using compton with the following systemd unit:
[Unit]
Description=start compositing manager
PartOf=graphical-session.target
ConditionHost=angela
[Service]
Type=exec
ExecStart=compton --show-all-xerrors --backend glx --vsync opengl-swc
Restart=on-failure
[Install]
RequiredBy=graphical-session.target
compton is orphaned however, so you might be tempted to use picom instead, but in my experience the latter uses much more power (1-2W extra, similar experience). I also tried compiz but it would just crash with:
anarcat@angela:~$ compiz --replace
compiz (core) - Warn: No XI2 extension
compiz (core) - Error: Another composite manager is already running on screen: 0
compiz (core) - Fatal: No manageable screens found on display :0
When running from the base session, I would get this instead:
compiz (core) - Warn: No XI2 extension
compiz (core) - Error: Couldn't load plugin 'ccp'
compiz (core) - Error: Couldn't load plugin 'ccp'
Thanks to EmanueleRocca for figuring all that out. See also this discussion about power management on the Framework forum. Note that Wayland environments do not require any special configuration here and actually work better, see my Wayland migration notes for details.
Also note that the iwlwifi firmware also looks incomplete. Even with the package installed, I get those errors in dmesg:
[   19.534429] Intel(R) Wireless WiFi driver for Linux
[   19.534691] iwlwifi 0000:a6:00.0: enabling device (0000 -> 0002)
[   19.541867] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-72.ucode (-2)
[   19.541881] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-72.ucode (-2)
[   19.541882] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-72.ucode failed with error -2
[   19.541890] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-71.ucode (-2)
[   19.541895] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-71.ucode (-2)
[   19.541896] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-71.ucode failed with error -2
[   19.541903] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-70.ucode (-2)
[   19.541907] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-70.ucode (-2)
[   19.541908] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-70.ucode failed with error -2
[   19.541913] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-69.ucode (-2)
[   19.541916] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-69.ucode (-2)
[   19.541917] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-69.ucode failed with error -2
[   19.541922] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-68.ucode (-2)
[   19.541926] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-68.ucode (-2)
[   19.541927] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-68.ucode failed with error -2
[   19.541933] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-67.ucode (-2)
[   19.541937] iwlwifi 0000:a6:00.0: firmware: failed to load iwlwifi-ty-a0-gf-a0-67.ucode (-2)
[   19.541937] iwlwifi 0000:a6:00.0: Direct firmware load for iwlwifi-ty-a0-gf-a0-67.ucode failed with error -2
[   19.544244] iwlwifi 0000:a6:00.0: firmware: direct-loading firmware iwlwifi-ty-a0-gf-a0-66.ucode
[   19.544257] iwlwifi 0000:a6:00.0: api flags index 2 larger than supported by driver
[   19.544270] iwlwifi 0000:a6:00.0: TLV_FW_FSEQ_VERSION: FSEQ Version: 0.63.2.1
[   19.544523] iwlwifi 0000:a6:00.0: firmware: failed to load iwl-debug-yoyo.bin (-2)
[   19.544528] iwlwifi 0000:a6:00.0: firmware: failed to load iwl-debug-yoyo.bin (-2)
[   19.544530] iwlwifi 0000:a6:00.0: loaded firmware version 66.55c64978.0 ty-a0-gf-a0-66.ucode op_mode iwlmvm
Some of those are available in the latest upstream firmware package (iwlwifi-ty-a0-gf-a0-71.ucode, -68, and -67), but not all (e.g. iwlwifi-ty-a0-gf-a0-72.ucode is missing) . It's unclear what those do or don't, as the WiFi seems to work well without them. I still copied them in from the latest linux-firmware package in the hope they would help with power management, but I did not notice a change after loading them. There are also multiple knobs on the iwlwifi and iwlmvm drivers. The latter has a power_schmeme setting which defaults to 2 (balanced), setting it to 3 (low power) could improve battery usage as well, in theory. The iwlwifi driver also has power_save (defaults to disabled) and power_level (1-5, defaults to 1) settings. See also the output of modinfo iwlwifi and modinfo iwlmvm for other driver options.

Graphics acceleration After loading the latest upstream firmware and setting up a compositing manager (compton, above), I tested the classic glxgears. Running in a window gives me odd results, as the gears basically grind to a halt:
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
137 frames in 5.1 seconds = 26.984 FPS
27 frames in 5.4 seconds =  5.022 FPS
Ouch. 5FPS! But interestingly, once the window is in full screen, it does hit the monitor refresh rate:
300 frames in 5.0 seconds = 60.000 FPS
I'm not really a gamer and I'm not normally using any of that fancy graphics acceleration stuff (except maybe my browser does?). I installed intel-gpu-tools for the intel_gpu_top command to confirm the GPU was engaged when doing those simulations. A nice find. Other useful diagnostic tools include glxgears and glxinfo (in mesa-utils) and (vainfo in vainfo). Following to this post, I also made sure to have those settings in my about:config in Firefox, or, in user.js:
user_pref("media.ffmpeg.vaapi.enabled", true);
Note that the guide suggests many other settings to tweak, but those might actually be overkill, see this comment and its parents. I did try forcing hardware acceleration by setting gfx.webrender.all to true, but everything became choppy and weird. The guide also mentions installing the intel-media-driver package, but I could not find that in Debian. The Arch wiki has, as usual, an excellent reference on hardware acceleration in Firefox.

Chromium / Signal desktop bugs It looks like both Chromium and Signal Desktop misbehave with my compositor setup (compton + i3). The fix is to add a persistent flag to Chromium. In Arch, it's conveniently in ~/.config/chromium-flags.conf but that doesn't actually work in Debian. I had to put the flag in /etc/chromium.d/disable-compositing, like this:
export CHROMIUM_FLAGS="$CHROMIUM_FLAGS --disable-gpu-compositing"
It's possible another one of the hundreds of flags might fix this issue better, but I don't really have time to go through this entire, incomplete, and unofficial list (!?!). Signal Desktop is a similar problem, and doesn't reuse those flags (because of course it doesn't). Instead I had to rewrite the wrapper script in /usr/local/bin/signal-desktop to use this instead:
exec /usr/bin/flatpak run --branch=stable --arch=x86_64 org.signal.Signal --disable-gpu-compositing "$@"
This was mostly done in this Puppet commit. I haven't figured out the root of this problem. I did try using picom and xcompmgr; they both suffer from the same issue. Another Debian testing user on Wayland told me they haven't seen this problem, so hopefully this can be fixed by switching to wayland.

Graphics card hangs I believe I might have this bug which results in a total graphical hang for 15-30 seconds. It's fairly rare so it's not too disruptive, but when it does happen, it's pretty alarming. The comments on that bug report are encouraging though: it seems this is a bug in either mesa or the Intel graphics driver, which means many people have this problem so it's likely to be fixed. There's actually a merge request on mesa already (2022-12-29). It could also be that bug because the error message I get is actually:
Jan 20 12:49:10 angela kernel: Asynchronous wait on fence 0000:00:02.0:sway[104431]:cb0ae timed out (hint:intel_atomic_commit_ready [i915]) 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on rcs0 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GuC firmware i915/adlp_guc_70.1.1.bin version 70.1 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] HuC authenticated 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GuC submission enabled 
Jan 20 12:49:15 angela kernel: i915 0000:00:02.0: [drm] GuC SLPC enabled
It's a solid 30 seconds graphical hang. Maybe the keyboard and everything else keeps working. The latter bug report is quite long, with many comments, but this one from January 2023 seems to say that Sway 1.8 fixed the problem. There's also an earlier patch to add an extra kernel parameter that supposedly fixes that too. There's all sorts of other workarounds in there, for example this:
echo "options i915 enable_dc=1 enable_guc_loading=1 enable_guc_submission=1 edp_vswing=0 enable_guc=2 enable_fbc=1 enable_psr=1 disable_power_well=0"   sudo tee /etc/modprobe.d/i915.conf
from this comment... So that one is unsolved, as far as the upstream drivers are concerned, but maybe could be fixed through Sway.

Weird USB hangs / graphical glitches I have had weird connectivity glitches better described in this post, but basically: my USB keyboard and mice (connected over a USB hub) drop keys, lag a lot or hang, and I get visual glitches. The fix was to tighten the screws around the CPU on the motherboard (!), which is, thankfully, a rather simple repair.

USB docks are hell Note that the monitors are hooked up to angela through a USB-C / Thunderbolt dock from Cable Matters, with the lovely name of 201053-SIL. It has issues, see this blog post for an in-depth discussion.

Shipping details I ordered the Framework in August 2022 and received it about a month later, which is sooner than expected because the August batch was late. People (including me) expected this to have an impact on the September batch, but it seems Framework have been able to fix the delivery problems and keep up with the demand. As of early 2023, their website announces that laptops ship "within 5 days". I have myself ordered a few expansion cards in November 2022, and they shipped on the same day, arriving 3-4 days later.

The supply pipeline There are basically 6 steps in the Framework shipping pipeline, each (except the last) accompanied with an email notification:
  1. pre-order
  2. preparing batch
  3. preparing order
  4. payment complete
  5. shipping
  6. (received)
This comes from the crowdsourced spreadsheet, which should be updated when the status changes here. I was part of the "third batch" of the 12th generation laptop, which was supposed to ship in September. It ended up arriving on my door step on September 27th, about 33 days after ordering. It seems current orders are not processed in "batches", but in real time, see this blog post for details on shipping.

Shipping trivia I don't know about the others, but my laptop shipped through no less than four different airplane flights. Here are the hops it took: I can't quite figure out how to calculate exactly how much mileage that is, but it's huge. The ride through Alaska is surprising enough but the bounce back through Winnipeg is especially weird. I guess the route happens that way because of Fedex shipping hubs. There was a related oddity when I had my Purism laptop shipped: it left from the west coast and seemed to enter on an endless, two week long road trip across the continental US.

Other resources

15 February 2023

Lukas M rdian: Netplan v0.106 is now available

I m happy to announce that Netplan version 0.106 is now available on GitHub and is soon to be deployed into an Ubuntu/Debian/Fedora installation near you! Six months and 65 commits after the previous version, this release is brought to you by 4 free software contributors from around the globe. Highlights Highlights of this release include the new netplan status command, which queries your system for IP addresses, routes, DNS information, etc in addition to the Netplan backend renderer (NetworkManager/networkd) in use and the relevant Netplan YAML configuration ID. It displays all this in a nicely formatted way (or alternatively in machine readable YAML/JSON format).
Furthermore, we implemented a clean libnetplan API which can be used by external tools to parse Netplan configuration, migrated away from non-inclusive language (PR#303) and improved the overall Netplan documentation. Another change that should be noted, is that the match.macaddress stanza now only matches on PermanentMACAddress= on the systemd-networkd backend, as has been the case on the NetworkManager backend ever since (see PR#278 for background information on this slight change in behavior). Changelog Bug fixes:

13 February 2023

Vincent Bernat: Building a SQL-like language to filter flows

Akvorado collects network flows using IPFIX or sFlow. It stores them in a ClickHouse database. A web console allows a user to query the data and plot some graphs. A nice aspect of this console is how we can filter flows with a SQL-like language:
Filter editor in Akvorado console
Often, web interfaces expose a query builder to build such filters. I think combining a SQL-like language with an editor supporting completion, syntax highlighting, and linting is a better approach.1 The language parser is built with pigeon (Go) from a parsing expression grammar or PEG. The editor component is CodeMirror (TypeScript).

Language parser PEG grammars are relatively recent2 and are an alternative to context-free grammars. They are easier to write and they can generate better error messages. Python switched from an LL(1)-based parser to a PEG-based parser in Python 3.9. pigeon generates a parser for Go. A grammar is a set of rules. Each rule is an identifier, with an optional user-friendly label for error messages, an expression, and an action in Go to be executed on match. You can find the complete grammar in parser.peg. Here is a simplified rule:
ConditionIPExpr "condition on IP"  
  column:("ExporterAddress"i   return "ExporterAddress", nil  
        / "SrcAddr"i   return "SrcAddr", nil  
        / "DstAddr"i   return "DstAddr", nil  ) _ 
  operator:("=" / "!=") _ 
  ip:IP  
    return fmt.Sprintf("%s %s IPv6StringToNum(%s)",
      toString(column), toString(operator), quote(ip)), nil
   
The rule identifier is ConditionIPExpr. It case-insensitively matches ExporterAddress, SrcAddr, or DstAddr. The action for each case returns the proper case for the column name. That s what is stored in the column variable. Then, it matches one of the possible operators. As there is no code block, it stores the matched string directly in the operator variable. Then, it tries to match the IP rule, which is defined elsewhere in the grammar. If it succeeds, it stores the result of the match in the ip variable and executes the final action. The action turns the column, operator, and IP into a proper expression for ClickHouse. For example, if we have ExporterAddress = 203.0.113.15, we get ExporterAddress = IPv6StringToNum('203.0.113.15'). The IP rule uses a rudimentary regular expression but checks if the matched address is correct in the action block, thanks to netip.ParseAddr():
IP "IP address"   [0-9A-Fa-f:.]+  
  ip, err := netip.ParseAddr(string(c.text))
  if err != nil  
    return "", errors.New("expecting an IP address")
   
  return ip.String(), nil
 
Our parser safely turns the filter into a WHERE clause accepted by ClickHouse:3
WHERE InIfBoundary = 'external' 
AND ExporterRegion = 'france' 
AND InIfConnectivity = 'transit' 
AND SrcAS = 15169 
AND DstAddr BETWEEN toIPv6('2a01:e0f:ffff::') 
                AND toIPv6('2a01:e0f:ffff:ffff:ffff:ffff:ffff:ffff')

Integration in CodeMirror CodeMirror is a versatile code editor that can be easily integrated into JavaScript projects. In Akvorado, the Vue.js component, InputFilter, uses CodeMirror as its foundation and leverages features such as syntax highlighting, linting, and completion. The source code for these capabilities can be found in the codemirror/lang-filter/ directory.

Syntax highlighting The PEG grammar for Go cannot be utilized directly4 and the requirements for parsers for editors are distinct: they should be error-tolerant and operate incrementally, as code is typically updated character by character. CodeMirror offers a solution through its own parser generator, Lezer. We don t need this additional parser to fully understand the filter language. Only the basic structure is needed: column names, comparison and logic operators, quoted and unquoted values. The grammar is therefore quite short and does not need to be updated often:
@top Filter  
  expression
 
expression  
 Not expression  
 "(" expression ")"  
 "(" expression ")" And expression  
 "(" expression ")" Or expression  
 comparisonExpression And expression  
 comparisonExpression Or expression  
 comparisonExpression
 
comparisonExpression  
 Column Operator Value
 
Value  
  String   Literal   ValueLParen ListOfValues ValueRParen
 
ListOfValues  
  ListOfValues ValueComma (String   Literal)  
  String   Literal
 
// [ ]
@tokens  
  // [ ]
  Column   std.asciiLetter (std.asciiLetter std.digit)*  
  Operator   $[a-zA-Z!=><]+  
  String  
    '"' (![\\\n"]   "\\" _)* '"'?  
    "'" (![\\\n']   "\\" _)* "'"?
   
  Literal   (std.digit   std.asciiLetter   $[.:/])+  
  // [ ]
 
The expression SrcAS = 12322 AND (DstAS = 1299 OR SrcAS = 29447) is parsed to:
Filter(Column, Operator, Value(Literal),
  And, Column, Operator, Value(Literal),
  Or, Column, Operator, Value(Literal))
The last step is to teach CodeMirror how to map each token to a highlighting tag:
export const FilterLanguage = LRLanguage.define( 
  parser: parser.configure( 
    props: [
      styleTags( 
        Column: t.propertyName,
        String: t.string,
        Literal: t.literal,
        LineComment: t.lineComment,
        BlockComment: t.blockComment,
        Or: t.logicOperator,
        And: t.logicOperator,
        Not: t.logicOperator,
        Operator: t.compareOperator,
        "( )": t.paren,
       ),
    ],
   ),
 );

Linting We offload linting to the original parser in Go. The /api/v0/console/filter/validate endpoint accepts a filter and returns a JSON structure with the errors that were found:
 
  "message": "at line 1, position 12: string literal not terminated",
  "errors": [ 
    "line":    1,
    "column":  12,
    "offset":  11,
    "message": "string literal not terminated",
   ]
 
The linter source for CodeMirror queries the API and turns each error into a diagnostic.

Completion The completion system takes a hybrid approach. It splits the work between the frontend and the backend to offer useful suggestions for completing filters. The frontend uses the parser built with Lezer to determine the context of the completion: do we complete a column name, an operator, or a value? It also extracts the column name if we are completing something else. It forwards the result to the backend through the /api/v0/console/filter/complete endpoint. Walking the syntax tree was not as easy as I thought, but unit tests helped a lot. The backend uses the parser generated by pigeon to complete a column name or a comparison operator. For values, the completions are either static or extracted from the ClickHouse database. A user can complete an AS number from an organization name thanks to the following snippet:
results := []struct  
  Label  string  ch:"label" 
  Detail string  ch:"detail" 
 
columnName := "DstAS"
sqlQuery := fmt.Sprintf( 
 SELECT concat('AS', toString(%s)) AS label, dictGet('asns', 'name', %s) AS detail
 FROM flows
 WHERE TimeReceived > date_sub(minute, 1, now())
 AND detail != ''
 AND positionCaseInsensitive(detail, $1) >= 1
 GROUP BY label, detail
 ORDER BY COUNT(*) DESC
 LIMIT 20
 , columnName, columnName)
if err := conn.Select(ctx, &results, sqlQuery, input.Prefix); err != nil  
  c.r.Err(err).Msg("unable to query database")
  break
 
for _, result := range results  
  completions = append(completions, filterCompletion 
    Label:  result.Label,
    Detail: result.Detail,
    Quoted: false,
   )
 
In my opinion, the completion system is a major factor in making the field editor an efficient way to select flows. While a query builder may have been more beginner-friendly, the completion system s ease of use and functionality make it more enjoyable to use once you become familiar.

  1. Moreover, building a query builder did not seem like a fun task for me.
  2. They were introduced in 2004 in Parsing Expression Grammars: A Recognition-Based Syntactic Foundation. LR parsers were introduced in 1965, LALR parsers in 1969, and LL parsers in the 1970s. Yacc, a popular parser generator, was written in 1975.
  3. The parser returns a string. It does not generate an intermediate AST. This makes it simpler and it currently fits our needs.
  4. It could be manually translated to JavaScript with PEG.js.

9 February 2023

Jonathan McDowell: Building a read-only Debian root setup: Part 2

This is the second part of how I build a read-only root setup for my router. You might want to read part 1 first, which covers the initial boot and general overview of how I tie the pieces together. This post will describe how I build the squashfs image that forms the main filesystem. Most of the build is driven from a script, make-router, which I ll dissect below. It s highly tailored to my needs, and this is a fairly lengthy post, but hopefully the steps I describe prove useful to anyone trying to do something similar.
Breakdown of make-router
#!/bin/bash
# Either rb3011 (arm) or rb5009 (arm64)
#HOSTNAME="rb3011"
HOSTNAME="rb5009"
if [ "x$ HOSTNAME " == "xrb3011" ]; then
	ARCH=armhf
elif [ "x$ HOSTNAME " == "xrb5009" ]; then
	ARCH=arm64
else
	echo "Unknown host: $ HOSTNAME "
	exit 1
fi

It s a bash script, and I allow building for either my RB3011 or RB5009, which means a different architecture (32 vs 64 bit). I run this script on my Pi 4 which means I don t have to mess about with QemuUserEmulation.
BASE_DIR=$(dirname $0)
IMAGE_FILE=$(mktemp --tmpdir router.$ ARCH .XXXXXXXXXX.img)
MOUNT_POINT=$(mktemp -p /mnt -d router.$ ARCH .XXXXXXXXXX)
# Build and mount an ext4 image file to put the root file system in
dd if=/dev/zero bs=1 count=0 seek=1G of=$ IMAGE_FILE 
mkfs -t ext4 $ IMAGE_FILE 
mount -o loop $ IMAGE_FILE  $ MOUNT_POINT 

I build the image in a loopback ext4 file on tmpfs (my Pi4 is the 8G model), which makes things a bit faster.
# Add dpkg excludes
mkdir -p $ MOUNT_POINT /etc/dpkg/dpkg.cfg.d/
cat <<EOF > $ MOUNT_POINT /etc/dpkg/dpkg.cfg.d/path-excludes
# Exclude docs
path-exclude=/usr/share/doc/*
# Only locale we want is English
path-exclude=/usr/share/locale/*
path-include=/usr/share/locale/en*/*
path-include=/usr/share/locale/locale.alias
# No man pages
path-exclude=/usr/share/man/*
EOF

Create a dpkg excludes config to drop docs, man pages and most locales before we even start the bootstrap.
# Setup fstab + mtab
echo "# Empty fstab as root is pre-mounted" > $ MOUNT_POINT /etc/fstab
ln -s ../proc/self/mounts $ MOUNT_POINT /etc/mtab
# Setup hostname
echo $ HOSTNAME  > $ MOUNT_POINT /etc/hostname
# Add the root SSH keys
mkdir -p $ MOUNT_POINT /root/.ssh/
cat <<EOF > $ MOUNT_POINT /root/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAv8NkUeVdsVdegS+JT9qwFwiHEgcC9sBwnv6RjpH6I4d3im4LOaPOatzneMTZlH8Gird+H4nzluciBr63hxmcFjZVW7dl6mxlNX2t/wKvV0loxtEmHMoI7VMCnrWD0PyvwJ8qqNu9cANoYriZRhRCsBi27qPNvI741zEpXN8QQs7D3sfe4GSft9yQplfJkSldN+2qJHvd0AHKxRdD+XTxv1Ot26+ZoF3MJ9MqtK+FS+fD9/ESLxMlOpHD7ltvCRol3u7YoaUo2HJ+u31l0uwPZTqkPNS9fkmeCYEE0oXlwvUTLIbMnLbc7NKiLgniG8XaT0RYHtOnoc2l2UnTvH5qsQ== noodles@earth.li
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDQb9+qFemcwKhey3+eTh5lxp+3sgZXW2HQQEZMt9hPvVXk+MiiNMx9WUzxPJnwXqlmmVdKsq+AvjA0i505Pp8fIj5DdUBpSqpLghmzpnGuob7SSwXYj+352hjD52UC4S0KMKbIaUpklADgsCbtzhYYc4WoO8F7kK63tS5qa1XSZwwRwPbYOWBcNocfr9oXCVWD9ismO8Y0l75G6EyW8UmwYAohDaV83pvJxQerYyYXBGZGY8FNjqVoOGMRBTUcLj/QTo0CDQvMtsEoWeCd0xKLZ3gjiH3UrknkaPra557/TWymQ8Oh15aPFTr5FvKgAlmZaaM0tP71SOGmx7GpCsP4jZD1Xj/7QMTAkLXb+Ou6yUOVM9J4qebdnmF2RGbf1bwo7xSIX6gAYaYgdnppuxqZX1wyAy+A2Hie4tUjMHKJ6OoFwBsV1sl+3FobrPn6IuulRCzsq2aLqLey+PHxuNAYdSKo7nIDB3qCCPwHlDK52WooSuuMidX4ujTUw7LDTia9FxAawudblxbrvfTbg3DsiDBAOAIdBV37HOAKu3VmvYSPyqT80DEy8KFmUpCEau59DID9VERkG6PWPVMiQnqgW2Agn1miOBZeIQV8PFjenAySxjzrNfb4VY/i/kK9nIhXn92CAu4nl6D+VUlw+IpQ8PZlWlvVxAtLonpjxr9OTw== noodles@yubikey
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC0I8UHj4IpfqUcGE4cTvLB0d2xmATSUzqtxW6ZhGbZxvQDKJesVW6HunrJ4NFTQuQJYgOXY/o82qBpkEKqaJMEFHTCjcaj3M6DIaxpiRfQfs0nhtzDB6zPiZn9Suxb0s5Qr4sTWd6iI9da72z3hp9QHNAu4vpa4MSNE+al3UfUisUf4l8TaBYKwQcduCE0z2n2FTi3QzmlkOgH4MgyqBBEaqx1tq7Zcln0P0TYZXFtrxVyoqBBIoIEqYxmFIQP887W50wQka95dBGqjtV+d8IbrQ4pB55qTxMd91L+F8n8A6nhQe7DckjS0Xdla52b9RXNXoobhtvx9K2prisagsHT noodles@cup
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBK6iGog3WbNhrmrkglNjVO8/B6m7mN6q1tMm1sXjLxQa+F86ETTLiXNeFQVKCHYrk8f7hK0d2uxwgj6Ixy9k0Cw= noodles@sevai
EOF

Setup fstab, the hostname and SSH keys for root.
# Bootstrap our install
debootstrap \
	--arch=$ ARCH  \
	--include=collectd-core,conntrack,dnsmasq,ethtool,iperf3,kexec-tools,mosquitto,mtd-utils,mtr-tiny,ppp,tcpdump,rng-tools5,ssh,watchdog,wget \
	--exclude=dmidecode,isc-dhcp-client,isc-dhcp-common,makedev,nano \
	bullseye $ MOUNT_POINT  https://deb.debian.org/debian/

Actually do the debootstrap step, including a bunch of extra packages that we want.
# Install mqtt-arp
cp $ BASE_DIR /debs/mqtt-arp_1_$ ARCH .deb $ MOUNT_POINT /tmp
chroot $ MOUNT_POINT  dpkg -i /tmp/mqtt-arp_1_$ ARCH .deb
rm $ MOUNT_POINT /tmp/mqtt-arp_1_$ ARCH .deb
# Frob the mqtt-arp config so it starts after mosquitto
sed -i -e 's/After=.*/After=mosquitto.service/' $ MOUNT_POINT /lib/systemd/system/mqtt-arp.service

I haven t uploaded mqtt-arp to Debian, so I install a locally built package, and ensure it starts after mosquitto (the MQTT broker), given they re running on the same host.
# Frob watchdog so it starts earlier than multi-user
sed -i -e 's/After=.*/After=basic.target/' $ MOUNT_POINT /lib/systemd/system/watchdog.service
# Make sure the watchdog is poking the device file
sed -i -e 's/^#watchdog-device/watchdog-device/' $ MOUNT_POINT /etc/watchdog.conf

watchdog timeouts were particularly an issue on the RB3011, where the default timeout didn t give enough time to reach multiuser mode before it would reset the router. Not helpful, so alter the config to start it earlier (and make sure it s configured to actually kick the device file).
# Clean up docs + locales
rm -r $ MOUNT_POINT /usr/share/doc/*
rm -r $ MOUNT_POINT /usr/share/man/*
for dir in $ MOUNT_POINT /usr/share/locale/*/; do
	if [ "$ dir " != "$ MOUNT_POINT /usr/share/locale/en/" ]; then
		rm -r $ dir 
	fi
done

Clean up any docs etc that ended up installed.
# Set root password to root
echo "root:root"   chroot $ MOUNT_POINT  chpasswd

The only login method is ssh key to the root account though I suppose this allows for someone to execute a privilege escalation from a daemon user so I should probably randomise this. Does need to be known though so it s possible to login via the serial console for debugging.
# Add security to sources.list + update
echo "deb https://security.debian.org/debian-security bullseye-security main" >> $ MOUNT_POINT /etc/apt/sources.list
chroot $ MOUNT_POINT  apt update
chroot $ MOUNT_POINT  apt -y full-upgrade
chroot $ MOUNT_POINT  apt clean
# Cleanup the APT lists
rm $ MOUNT_POINT /var/lib/apt/lists/www.*
rm $ MOUNT_POINT /var/lib/apt/lists/security.*

Pull in any security updates, then clean out the APT lists rather than polluting the image with them.
# Disable the daily APT timer
rm $ MOUNT_POINT /etc/systemd/system/timers.target.wants/apt-daily.timer
# Disable daily dpkg backup
cat <<EOF > $ MOUNT_POINT /etc/cron.daily/dpkg
#!/bin/sh
# Don't do the daily dpkg backup
exit 0
EOF
# We don't want a persistent systemd journal
rmdir $ MOUNT_POINT /var/log/journal

None of these make sense on a router.
# Enable nftables
ln -s /lib/systemd/system/nftables.service \
	$ MOUNT_POINT /etc/systemd/system/sysinit.target.wants/nftables.service

Ensure we have firewalling enabled automatically.
# Add systemd-coredump + systemd-timesync user / group
echo "systemd-timesync:x:998:" >> $ MOUNT_POINT /etc/group
echo "systemd-coredump:x:999:" >> $ MOUNT_POINT /etc/group
echo "systemd-timesync:!*::" >> $ MOUNT_POINT /etc/gshadow
echo "systemd-coredump:!*::" >> $ MOUNT_POINT /etc/gshadow
echo "systemd-timesync:x:998:998:systemd Time Synchronization:/:/usr/sbin/nologin" >> $ MOUNT_POINT /etc/passwd
echo "systemd-coredump:x:999:999:systemd Core Dumper:/:/usr/sbin/nologin" >> $ MOUNT_POINT /etc/passwd
echo "systemd-timesync:!*:47358::::::" >> $ MOUNT_POINT /etc/shadow
echo "systemd-coredump:!*:47358::::::" >> $ MOUNT_POINT /etc/shadow
# Create /etc/.pwd.lock, otherwise it'll end up in the overlay
touch $ MOUNT_POINT /etc/.pwd.lock
chmod 600 $ MOUNT_POINT /etc/.pwd.lock

Create a number of users that will otherwise get created at boot, and a lock file that will otherwise get created anyway.
# Copy config files
cp --recursive --preserve=mode,timestamps $ BASE_DIR /etc/* $ MOUNT_POINT /etc/
cp --recursive --preserve=mode,timestamps $ BASE_DIR /etc-$ ARCH /* $ MOUNT_POINT /etc/
chroot $ MOUNT_POINT  chown mosquitto /etc/mosquitto/mosquitto.users
chroot $ MOUNT_POINT  chown mosquitto /etc/ssl/mqtt.home.key

There are config files that are easier to replace wholesale, some of which are specific to the hardware (e.g. related to network interfaces). See below for some more details.
# Build symlinks into flash for boot / modules
ln -s /mnt/flash/lib/modules $ MOUNT_POINT /lib/modules
rmdir $ MOUNT_POINT /boot
ln -s /mnt/flash/boot $ MOUNT_POINT /boot

The kernel + its modules live outside the squashfs image, on the USB flash drive that the image lives on. That makes for easier kernel upgrades.
# Put our git revision into os-release
echo -n "GIT_VERSION=" >> $ MOUNT_POINT /etc/os-release
(cd $ BASE_DIR  ; git describe --tags) >> $ MOUNT_POINT /etc/os-release

Always helpful to be able to check the image itself for what it was built from.
# Add some stuff to root's .bashrc
cat << EOF >> $ MOUNT_POINT /root/.bashrc
alias ls='ls -F --color=auto'
eval "\$(dircolors)"
case "\$TERM" in
xterm* rxvt*)
	PS1="\\[\\e]0;\\u@\\h: \\w\a\\]\$PS1"
	;;
*)
	;;
esac
EOF

Just some niceties for when I do end up logging in.
# Build the squashfs
mksquashfs $ MOUNT_POINT  /tmp/router.$ ARCH .squashfs \
	-comp xz

Actually build the squashfs image.
# Save the installed package list off
chroot $ MOUNT_POINT  dpkg --get-selections > /tmp/wip-installed-packages

Save off the installed package list. This was particularly useful when trying to replicate the existing router setup and making sure I had all the important packages installed. It doesn t really serve a purpose now.
In terms of the config files I copy into /etc, shared across both routers are the following:
Breakdown of shared config
  • apt config (disable recommends, periodic updates):
    • apt/apt.conf.d/10periodic, apt/apt.conf.d/local-recommends
  • Adding a default, empty, locale:
    • default/locale
  • DNS/DHCP:
    • dnsmasq.conf, dnsmasq.d/dhcp-ranges, dnsmasq.d/static-ips
    • hosts, resolv.conf
  • Enabling IP forwarding:
    • sysctl.conf
  • Logs related:
    • logrotate.conf, rsyslog.conf
  • MQTT related:
    • mosquitto/mosquitto.users, mosquitto/conf.d/ssl.conf, mosquitto/conf.d/users.conf, mosquitto/mosquitto.acl, mosquitto/mosquitto.conf
    • mqtt-arp.conf
    • ssl/lets-encrypt-r3.crt, ssl/mqtt.home.key, ssl/mqtt.home.crt
  • PPP configuration:
    • ppp/ip-up.d/0000usepeerdns, ppp/ipv6-up.d/defaultroute, ppp/pap-secrets, ppp/chap-secrets
    • network/interfaces.d/pppoe-wan
The router specific config is mostly related to networking:
Breakdown of router specific config
  • Firewalling:
    • nftables.conf
  • Interfaces:
    • dnsmasq.d/interfaces
    • network/interfaces.d/eth0, network/interfaces.d/p1, network/interfaces.d/p2, network/interfaces.d/p7, network/interfaces.d/p8
  • PPP config (network interface piece):
    • ppp/peers/aquiss
  • SSH keys:
    • ssh/ssh_host_ecdsa_key, ssh/ssh_host_ed25519_key, ssh/ssh_host_rsa_key, ssh/ssh_host_ecdsa_key.pub, ssh/ssh_host_ed25519_key.pub, ssh/ssh_host_rsa_key.pub
  • Monitoring:
    • collectd/collectd.conf, collectd/collectd.conf.d/network.conf

12 January 2023

Jonathan McDowell: Building a read-only Debian root setup: Part 1

I mentioned in the post about upgrading my home internet that part of the work I did was creating a read-only Debian root with a squashfs image. This post covers the details of how I boot with that image; a later post will cover how I build the squashfs image. First, David Reader kindly pointed me at his rodebian setup, which was helpful in making me think about the whole problem but ultimately not the direction I went. Primarily because on the old router (an RB3011) I am space constrained, with only 120M of usable flash, and so ideally I wanted as much as possible of the system in a well compressed filesystem. squashfs seemed like the best option for that, and ultimately I ended up with a 39M image. I ve then used overlayfs to mount a tmpfs, so I get what looks like a writeable system without having to do too many tweaks to the actual install. On the plus side I can then see exactly what is getting written where and decide whether I need to update something in the squashfs. I don t boot with an initrd - for initial testing I booted directly off a USB stick. I ve actually ended up continuing to do this in production, because I ve had no pressing reason to move it all to booting off internal flash (I ve ended up with a Sandisk SDCZ430-032G-G46 which is tiny). However nothing I m going to describe is dependent on that - this would work perfectly well for a initial UBIFS rootfs on internal NAND. So the basic overview is I boot off a minimal rootfs, mount a squashfs, create an appropriate tmpfs, mount an overlayfs that combines the two, then pivotroot into the overlayfs and exec its init so it becomes the rootfs. For the minimal rootfs I started with busybox, in particular I used the armhf busybox-static package from Debian. My RB5009 is an ARM64, but I wanted to be able to test on the RB3011 as well, which is ARMv7. Picking an armhf binary for the minimal rootfs lets me use the same image for both. Using the static build helps reduce the number of pieces involved in putting it all together. The busybox binary goes in /bin. I was able to cheat and chroot into the empty rootfs and call busybox --install -s to create symlinks for all the tools it provides, but I could have done this manually. There s only a handful that are actually needed, but it s amazing how much is crammed into a 1.2M binary. /sbin/init is a shell script:
Contents
#!/bin/ash
# Make sure we have a sane date
if [ -e /data/saved-date ]; then
        CURRENT_DATE=$(date -Iseconds)
        if [ "$ CURRENT_DATE:0:4 " -lt "2022" -o \
                        "$ CURRENT_DATE:0:4 " -gt "2030" ]; then
                echo Setting initial date
                date -s "$(cat /data/saved-date)"
        fi
fi
# Work out what platform we're on
ARCH=$(uname -m)
if [ "$ ARCH " == "aarch64" ]; then
        ARCH=arm64
else
        ARCH=armhf
fi
# Mount a tmpfs to store the changes
mount -t tmpfs root-rw /mnt/overlay/rw
# Make the directories we need in the tmpfs
mkdir /mnt/overlay/rw/upper
mkdir /mnt/overlay/rw/work
# Mount the squashfs and build an overlay root filesystem of it + the tmpfs
mount -t squashfs -o loop /data/router.$ ARCH .squashfs /mnt/overlay/lower
mount -t overlay \
        -o lowerdir=/mnt/overlay/lower,upperdir=/mnt/overlay/rw/upper,workdir=/mnt/overlay/rw/work \
        overlayfs-root /mnt/root
# Build the directories we need within the new root
mkdir /mnt/root/mnt/flash
mkdir /mnt/root/mnt/overlay
mkdir /mnt/root/mnt/overlay/lower
mkdir /mnt/root/mnt/overlay/rw
# Copy any stored state
if [ -e /data/state.$ ARCH .tar ]; then
        echo Restoring stored state
        cd /mnt/root
        tar xf /data/state.$ ARCH .tar
fi
cd /mnt/root
pivot_root . mnt/flash
echo Switching into root filesystem
exec chroot . sh -c "$(cat <<END
mount --move /mnt/flash/mnt/overlay/lower /mnt/overlay/lower
mount --move /mnt/flash/mnt/overlay/rw /mnt/overlay/rw
exec /sbin/init
END
)"
Most of what the script is doing is sorting out the squashfs + tmpfs backed overlayfs that becomes the full root filesystems, but there are a few other bits to note. First, we pick up a saved date from /data/saved-date - the router has no RTC and while it ll sort itself out with NTP once it gets networking up it s useful to make sure we don t end up comically far in the past or future. Second, the script looks at what architecture we re running and picks up an appropriate squashfs image from /data based on that. This let me use the same USB stick for testing on both the RB3011 and the RB5011. Finally we allow for a /data/state.$ ARCH .tar file to let us pick up changes to the rootfs at boot time - this prevents having to rebuild the squashfs image every time there s a persistent change. The other piece that doesn t show up in the script is that the kernel and its modules are all installed into this initial rootfs (and then symlinked from the squashfs). This lets me build a mostly modular kernel, as long as all the necessary drivers to mount the USB stick are built in. Once the system is fully booted the initial rootfs is available at /mnt/flash, by default mounted read-only (to avoid inadvertent writes), but able to be remounted to update the squashfs image, install a new kernel, or update the state tarball. /mnt/overlay/rw/upper/ is where updates to the overlayfs are written, which provides an easy way to see what files are changing, initially to determine what might need tweaked in the squashfs creation process and subsequently to be able to see what needs updated in the state tarball.

7 January 2023

Matthew Garrett: Asking ChatGPT to write my security-sensitive code for me

I have a patchset that makes it safe to use hibernation under Linux even in a secure boot world but it relies on preventing userland from using one of the TPM PCRs[1] that would otherwise be available to it. TPM 2 devices support emulating PCRs in NVRAM, which would obviously be beneficial in not taking away a scare resource, but the docs suggest there's no way to tie a key generation to an NVRAM value which makes this more complicated.

So, why not ask a computer to do this for me? I asked ChatGPT to write me some code that would bind a TPM key to an NVRAM value. After a couple of false starts it told me:To bind an encryption key to the value of a TPM NV index, you can use the TPM2_NV_DefineSpace command to define an NV index with the TPMA_NV_BIND attribute. Woo! That sounds perfect. Only one problem - I can't actually find the TPMA_NV_BIND attribute defined anywhere in the spec. Well, ask the AI again.

Here is the definition of TPMA_NV_BIND from the TPM 2.0 specification:
TPMA_NV_BIND (BIT 17) If SET, the contents of the NV Index are bound to the platform hierarchy.


Ok, that sounds convincing. Where is it in the spec?

The definition of TPMA_NV_BIND is on page 126 of the TPM 2.0 specification, in section 9.7.1 "TPMA_NV".

Huh. Nope. Page 126 says nothing about this. Section 9.7.1 says nothing about this. The TPMA_NV bits are actually described on page 168 of volume 2 of the spec, in section 13.4. Bit 17 is TPMA_NV_OWNERREAD. TPMA_NV_BIND does not appear in the spec anywhere. I told the AI that.

I apologize for the error in my previous response. You are correct that the TPMA_NV_BIND attribute does not appear in the TPM 2.0 specification.

To bind an encryption key to the value of a TPM NV index, you can use the TPM2_NV_DefineSpace command with the TPMA_NV_BIND attribute.


I'm not afraid of being unemployed in the near future.

[1] "Platform Configuration Registers" - a modern TPM has 24 of these. When something security sensitive happens, you generate a hash of that event and pass it to the TPM. The TPM appends that to an existing PCR value and then hashes that concatenated value and sets the PCR to that. This means the PCR value depends not only on the values provided, but also the order they're provided in. Various TPM operations can be made conditional on the PCR values meeting specific criteria.

comment count unavailable comments

1 January 2023

Jonathan McDowell: Free Software Activities for 2022

There is a move to Bring Back Blogging and having recently sorted out my own FreshRSS install I am completely in favour of such a thing. RSS feeds with complete posts, for preference, not just a teaser intro sentence/paragraph. It s also a reminder to me that I should blog more, and what better way to start 2023 than with my traditional recap of my Free Software activities in 2022. For previous years see 2019, 2020 + 2021

Conferences I attended DebConf22 in Prizen, Kosova this year, and finally hit the end of my luck in avoiding COVID. 0/10, would not recommend. Thankfully I didn t come down with symptoms until I got home (I felt fine and tested negative on arrival home, then started to feel terrible the next day and tested again), so I was able to enjoy the conference itself. I also made it to Linux Security Summit Europe 2022, which aligned with work related bits and was interesting. I suspect I would have been better going to LPC 2022 for the hallway track, though I did manage to get some overlap with folk being in town given that both were the same week.

Debian Most of my contributions to Free software continue to happen within Debian. We continue to operate a roughly 3 month rotation for Debian Keyring in terms of handling the regular updates, and I dealt with 2022.03.24, 2022.06.26, 2022.08.11, 2022.09.24, 2022.09.25 + 2022.12.24. There were a few out of cycle updates this year and I handled a couple of them. My other contributions are largely within the Debian Electronics Packaging Team. gcc-xtensa-lx106 saw a few updates, to GCC 11 + enabling D (10 + 11), then to GCC 12 (12). binutils-xtensa-lx106 got some minor packaging cleanups, which also served to force a rebuild with the current binutils source (5). libsigrokdecode got an upload to enable building with Python 3.10 (0.5.3-3). Related, I updated sdcc to a new upstream version (4.2.0+dfsg-1) - it s used for the sigrok-firmware-fx2lafw package and I do have a tendency to play with microcontrollers, so it s good to have a recent version available in the archive. I continue to pay attention to OpenOCD, with a minor set of updates to pull in some fixes from master (0.11.0-2). I was pleased to see the release process for 0.12.0 kick off and have been uploading RCs as they come out (0.12.0~rc1-1, 0.12.0~rc2-1 + 0.12.0~rc3-1). Upstream have been interested in the upcoming bookworm release cycle and I m hopeful we ll get 0.12.0 proper in before freeze. libjaylink also saw an upstream release (0.3.1-1). Package upload sponsorship isn t normally something I get involved with, because I find I have to spend a lot of time checking over things before I m comfortable doing the upload. However I did sponsor an initial upload for sugarjar and an update for mgba (0.10.0+dsfg-1, currently stuck in NEW). Credit to Michel for dealing swiftly with my review comments, and Ryan for producing a nicely reviewable set of changes. As part of the Data Protection Team I responded to various inbound queries to that team. There was also some discussion on debian-vote as part of the DPL election that I engaged with, as well as discussions at DebConf about how we can do things better. For Debian New Members I m mostly inactive as an application manager - we generally seem to have enough available recently. If that changes I ll look at stepping in to help, but I don t see that happening (it got close this year but several people had stood up before I got around to offering). I continue to be involved in Front Desk, having various conversations throughout the year with the rest of the team and occasionally approving some of the checks for new applicants. Towards the end of the year I got involved with the Debian Games Team, largely because I m keen to try and get my Kodi working with libretro based emulators - I d really like to be able to play old style games from the same interface as I can engage with locally stored movies, music and TV. It turns out there are a lot of moving pieces to make that happen, some missing from Debian and others in need of some TLC. I updated retroarch to current upstream (1.13.0+dfsg-1 + 1.13.0+dfsg-2) but while I was doing so upstream did another release. I plan on uploading 1.14.0 once 1.13.0 has migrated to testing. It turned out I also needed to update libretro-core-info (1.13.0-1) and retroarch-assets (1.7.6+git20221024+dfsg-1). In terms of actual emulators I pulled in new versions for genesisplusgx (1.7.4+git20221128-1) and libretro-bsnes-mercury (094+git20220807-1). On the Kodi side I haven t uploaded anything yet. I ve filed an ITP for rcheevos, which is a dependency for game.libretro and I have a fledgling package for game.libretro that I finally got working today. I m not sure if I can get it cleaned up enough in time to make the bookworm release, but I m hoping that at least the libretro piece is in a bit better shape now (though I m aware there are more emulator cores that could do with being updated).

Linux This year was a quiet year for personal Linux contributions. I submitted a minor fix for the qca8081 PHY with speeds lower than 2.5Gb/s that caused me issues on my RB5009.

Personal projects 2022 finally saw a minor releases of onak, 0.6.2, which resulted in a corresponding Debian upload (0.6.2-1). It has a couple of bug fixes but nothing major.. As I said last year it s not dead, just resting, but Sequoia PGP is probably where you should be looking for a modern OpenPGP implementation. I added some basic Debian packaging to mqtt-arp - I didn t bother uploading it as it s a fairly niche package, but I m using it locally.

31 December 2022

Chris Lamb: Favourite films of 2022

In my four most recent posts, I went over the memoirs and biographies, the non-fiction, the fiction and the 'classic' fiction I enjoyed reading in 2022. But in the very last of my roundup posts and in relatively less detail I'll be quickly sketching out the favourite movies that were new to me in 2022:

La Ronde (1950) An all-knowing narrator (Adolf Wohlbr ck) guides us through a series of vignettes in 1900s Vienna a soldier meets an eager young lady of the evening, and later he has an affair with a young lady who becomes a maid and who then does similarly with the young man of the house. On and on it goes, spinning on the carousel of life. A wonderfully beguiling movie, and a reminder that 'innocent' and 'charming' doesn't always imply 'childish'.

A Man Escaped (1956) A Man Escaped is the story of a WW2 resistance fighter who wishes to escape from a prison. Filmed in Robert Bresson's quasi-minimalistic style (and touching on his usual themes of Christian redemption), this deceptively simple-looking film rewards deeper analysis.

The Music Room (1958) In The Music Room, Satyajit Ray (best known for his Apu Trilogy) tells the story of a wealthy landlord who lives a decadent life with his wife and son. His passion is for music, however, and he spends a significant portion of his fortune on concerts held for the locals in his magnificent music room (or 'jalsaghar). However, his obsession for music (and its attendant quest for respect from his peers) is his eventual undoing, as he sacrifices both his family and wealth whilst trying to retain it. Indeed, The Music Room is less of an examination of a single rich man (or of the hypnotic sitar music by Vilayat Khan) than of the dying Bengali landowning class, which I read as symbolic of what was about to vanish from Indian culture in the post-Independence drive toward modernity. Still, this is a multi-layered and textured film, as this is also an intensely moving portrait of a highly-detached man as well.

The Hustler (1961) 'Fast Eddie' Felson is a small-time pool hustler with a lot of talent but a self-destructive attitude. His bravado causes him to challenge the legendary Minnesota Fats to a high-stakes match. The stakes are always much bigger than just money though, as Fast Eddie's ego is on the line. Despite not caring so much about the game, I was completely gripped by Paul Newman's superb performance. Almost certainly the best film about pool ever made, and probably the best sports film as well.

Picnic at Hanging Rock (1975) One Valentine's Day in early-1900s Australia, an upper-class school takes its girls on a field trip to a scenic volcanic formation in the middle of the brush. Despite all the rules against it, however, several of the girls venture off onto the rock, and it's not until the end of the day that the group realises some of the girls and one of the teachers have disappeared without a trace. A mesmerising film about colonialism, sexual repression and Antonioni-esque alienation, Picnic at Hanging Rock's dreamlike aesthetic is haunting.

Jeanne Dielman, 23, quai du Commerce, 1080 Bruxelles (1976) Jeanne Dielman is a three-and-a-half-hour film that shows a widowed housewife doing her daily chores and taking care of her apartment. Yet in its uncompromising nature, it raises the tedium of her life to the level of profundity. I watched this radically political film a few months before it was recently voted as the 'greatest of all time'), but despite its relentless rigorous nature (and long running time), the conclusion when it comes is shattering.

Raise the Red Lantern (1991) Set in China during the 1920s, Raise the Red Lantern tells the story of a young woman who has become the new concubine of a wealthy man. However, as he has three wives already (each living in a separate house within his castle) she becomes are engaged in an extremely complicated competition attention and the privileges that come with it. Richly colourful, both in its visuals and symbolism, this is film both specifically about China at the time and human universals.

Moonlight (2016) A heartbreaking story of a young man's struggle to find himself, told across three defining chapters in his life as he experiences the pain and beauty of falling in love, whilst grappling with his own sexuality within the ambient weather of race and class in the United States. Absolutely spellbinding and visually stunning.

Drive My Car (2021) Yusuke Kafuku is a stage director who is unable to cope with the loss of his wife, and has therefore accepted an invitation to direct Anton Chekhov's [Uncle Vanya][https://en.wikipedia.org/wiki/Uncle_Vanya] at a festival in Hiroshima, Japan. Yet whilst many reviewers have focused on the compelling relationship between Yusuke and Misaki (the introverted young woman who has been appointed to drive his car), what I found devastating was the embedded meditation on language, communication, understanding and empathy, especially concerning what happens when these break down or are no longer possible. Drive My Car touches on the limits of each of these phenomena connecting to the next: being able to speak the same language as another doesn't mean you can communicate with them; and being able to communicate doesn't mean you can understand someone, let alone empathise with them. All of this through the near-ungraspable emotion of grief as well, and the portion of this film where the actress was silently employing Korean Sign Language was the most moving thing I saw this year.

30 December 2022

Chris Lamb: Favourite books of 2022: Non-fiction

In my three most recent posts, I went over the memoirs and biographies, classics and fiction books that I enjoyed the most in 2022. But in the last of my book-related posts for 2022, I'll be going over my favourite works of non-fiction. Books that just missed the cut here include Adam Hochschild's King Leopold's Ghost (1998) on the role of Leopold II of Belgium in the Congo Free State, Johann Hari's Stolen Focus (2022) (a personal memoir on relating to how technology is increasingly fragmenting our attention), Amia Srinivasan's The Right to Sex (2021) (a misleadingly named set of philosophic essays on feminism), Dana Heller et al.'s The Selling of 9/11: How a National Tragedy Became a Commodity (2005), John Berger's mindbending Ways of Seeing (1972) and Louise Richardson's What Terrorists Want (2006).

The Great War and Modern Memory (1975)
Wartime: Understanding and Behavior in the Second World War (1989) Paul Fussell Rather than describe the battles, weapons, geopolitics or big personalities of the two World Wars, Paul Fussell's The Great War and Modern Memory & Wartime are focused instead on how the two wars have been remembered by their everyday participants. Drawing on the memoirs and memories of soldiers and civilians along with a brief comparison with the actual events that shaped them, Fussell's two books are a compassionate, insightful and moving piece of analysis. Fussell primarily sets himself against the admixture of nostalgia and trauma that obscures the origins and unimaginable experience of participating in these wars; two wars that were, in his view, a "perceptual and rhetorical scandal from which total recovery is unlikely." He takes particular aim at the dishonesty of hindsight:
For the past fifty years, the Allied war has been sanitised and romanticised almost beyond recognition by the sentimental, the loony patriotic, the ignorant and the bloodthirsty. I have tried to balance the scales. [And] in unbombed America especially, the meaning of the war [seems] inaccessible.
The author does not engage in any of the customary rose-tinted view of war, yet he remains understanding and compassionate towards those who try to locate a reason within what was quite often senseless barbarism. If anything, his despondency and pessimism about the Second World War (the war that Fussell himself fought in) shines through quite acutely, and this is especially the case in what he chooses to quote from others:
"It was common [ ] throughout the [Okinawa] campaign for replacements to get hit before we even knew their names. They came up confused, frightened, and hopeful, got wounded or killed, and went right back to the rear on the route by which they had come, shocked, bleeding, or stiff. They were forlorn figures coming up to the meat grinder and going right back out of it like homeless waifs, unknown and faceless to us, like unread books on a shelf."
It would take a rather heartless reader to fail to be sobered by this final simile, and an even colder one to view Fussell's citation of such an emotive anecdote to be manipulative. Still, stories and cruel ironies like this one infuse this often-angry book, but it is not without astute and shrewd analysis as well, especially on the many qualitative differences between the two conflicts that simply cannot be captured by facts and figures alone. For example:
A measure of the psychological distance of the Second [World] War from the First is the rarity, in 1914 1918, of drinking and drunkenness poems.
Indeed so. In fact, what makes Fussell's project so compelling and perhaps even unique is that he uses these non-quantitive measures to try and take stock of what happened. After all, this was a war conducted by humans, not the abstract school of statistics. And what is the value of a list of armaments destroyed by such-and-such a regiment when compared with truly consequential insights into both how the war affected, say, the psychology of postwar literature ("Prolonged trench warfare, whether enacted or remembered, fosters paranoid melodrama, which I take to be a primary mode in modern writing."), the specific words adopted by combatants ("It is a truism of military propaganda that monosyllabic enemies are easier to despise than others") as well as the very grammar of interaction:
The Field Service Post Card [in WW1] has the honour of being the first widespread exemplary of that kind of document which uniquely characterises the modern world: the "Form". [And] as the first widely known example of dehumanised, automated communication, the post card popularised a mode of rhetoric indispensable to the conduct of later wars fought by great faceless conscripted armies.
And this wouldn't be a book review without argument-ending observations that:
Indicative of the German wartime conception [of victory] would be Hitler and Speer's elaborate plans for the ultimate reconstruction of Berlin, which made no provision for a library.
Our myths about the two world wars possess an undisputed power, in part because they contain an essential truth the atrocities committed by Germany and its allies were not merely extreme or revolting, but their full dimensions (embodied in the Holocaust and the Holodomor) remain essentially inaccessible within our current ideological framework. Yet the two wars are better understood as an abyss in which we were all dragged into the depths of moral depravity, rather than a battle pitched by the forces of light against the forces of darkness. Fussell is one of the few observers that can truly accept and understand this truth and is still able to speak to us cogently on the topic from the vantage point of experience. The Second World War which looms so large in our contemporary understanding of the modern world (see below) may have been necessary and unavoidable, but Fussell convinces his reader that it was morally complicated "beyond the power of any literary or philosophic analysis to suggest," and that the only way to maintain a na ve belief in the myth that these wars were a Manichaean fight between good and evil is to overlook reality. There are many texts on the two World Wars that can either stir the intellect or move the emotions, but Fussell's two books do both. A uniquely perceptive and intelligent commentary; outstanding.

Longitude (1995) Dava Sobel Since Man first decided to sail the oceans, knowing one's location has always been critical. Yet doing so reliably used to be a serious problem if you didn't know where you were, you are far more likely to die and/or lose your valuable cargo. But whilst finding one's latitude (ie. your north south position) had effectively been solved by the beginning of the 17th century, finding one's (east west) longitude was far from trustworthy in comparison. This book first published in 1995 is therefore something of an anachronism. As in, we readily use the GPS facilities of our phones today without hesitation, so we find it difficult to imagine a reality in which knowing something fundamental like your own location is essentially unthinkable. It became clear in the 18th century, though, that in order to accurately determine one's longitude, what you actually needed was an accurate clock. In Longitude, therefore, we read of the remarkable story of John Harrison and his quest to create a timepiece that would not only keep time during a long sea voyage but would survive the rough ocean conditions as well. Self-educated and a carpenter by trade, Harrison made a number of important breakthroughs in keeping accurate time at sea, and Longitude describes his novel breakthroughs in a way that is both engaging and without talking down to the reader. Still, this book covers much more than that, including the development of accurate longitude going hand-in-hand with advancements in cartography as well as in scientific experiments to determine the speed of light: experiments that led to the formulation of quantum mechanics. It also outlines the work being done by Harrison's competitors. 'Competitors' is indeed the correct word here, as Parliament offered a huge prize to whoever could create such a device, and the ramifications of this tremendous financial incentive are an essential part of this story. For the most part, though, Longitude sticks to the story of Harrison and his evolving obsession with his creating the perfect timepiece. Indeed, one reason that Longitude is so resonant with readers is that many of the tropes of the archetypical 'English inventor' are embedded within Harrison himself. That is to say, here is a self-made man pushing against the establishment of the time, with his groundbreaking ideas being underappreciated in his life, or dishonestly purloined by his intellectual inferiors. At the level of allegory, then, I am minded to interpret this portrait of Harrison as a symbolic distillation of postwar Britain a nation acutely embarrassed by the loss of the Empire that is now repositioning itself as a resourceful but plucky underdog; a country that, with a combination of the brains of boffins and a healthy dose of charisma and PR, can still keep up with the big boys. (It is this same search for postimperial meaning I find in the fiction of John le Carr , and, far more famously, in the James Bond franchise.) All of this is left to the reader, of course, as what makes Longitute singularly compelling is its gentle manner and tone. Indeed, at times it was as if the doyenne of sci-fi Ursula K. LeGuin had a sideline in popular non-fiction. I realise it's a mark of critical distinction to downgrade the importance of popular science in favour of erudite academic texts, but Latitude is ample evidence that so-called 'pop' science need not be patronising or reductive at all.

Closed Chambers: The Rise, Fall, and Future of the Modern Supreme Court (1998) Edward Lazarus After the landmark decision by the U.S. Supreme Court in *Dobbs v. Jackson Women's Health Organization that ended the Constitutional right to abortion conferred by Roe v Wade, I prioritised a few books in the queue about the judicial branch of the United States. One of these books was Closed Chambers, which attempts to assay, according to its subtitle, "The Rise, Fall and Future of the Modern Supreme Court". This book is not merely simply a learned guide to the history and functioning of the Court (although it is completely creditable in this respect); it's actually an 'insider' view of the workings of the institution as Lazurus was a clerk for Justice Harry Blackmun during the October term of 1988. Lazarus has therefore combined his experience as a clerk and his personal reflections (along with a substantial body of subsequent research) in order to communicate the collapse in comity between the Justices. Part of this book is therefore a pure history of the Court, detailing its important nineteenth-century judgements (such as Dred Scott which ruled that the Constitution did not consider Blacks to be citizens; and Plessy v. Ferguson which failed to find protection in the Constitution against racial segregation laws), as well as many twentieth-century cases that touch on the rather technical principle of substantive due process. Other layers of Lazurus' book are explicitly opinionated, however, and they capture the author's assessment of the Court's actions in the past and present [1998] day. Given the role in which he served at the Court, particular attention is given by Lazarus to the function of its clerks. These are revealed as being far more than the mere amanuenses they were hitherto believed to be. Indeed, the book is potentially unique in its the claim that the clerks have played a pivotal role in the deliberations, machinations and eventual rulings of the Court. By implication, then, the clerks have plaedy a crucial role in the internal controversies that surround many of the high-profile Supreme Court decisions decisions that, to the outsider at least, are presented as disinterested interpretations of Constitution of the United States. This is of especial importance given that, to Lazarus, "for all the attention we now pay to it, the Court remains shrouded in confusion and misunderstanding." Throughout his book, Lazarus complicates the commonplace view that the Court is divided into two simple right vs. left political factions, and instead documents an ever-evolving series of loosely held but strongly felt series of cabals, quid pro quo exchanges, outright equivocation and pure personal prejudices. (The age and concomitant illnesses of the Justices also appears to have a not insignificant effect on the Court's rulings as well.) In other words, Closed Chambers is not a book that will be read in a typical civics class in America, and the only time the book resorts to the customary breathless rhetoric about the US federal government is in its opening chapter:
The Court itself, a Greek-style temple commanding the crest of Capitol Hill, loomed above them in the dim light of the storm. Set atop a broad marble plaza and thirty-six steps, the Court stands in splendid isolation appropriate to its place at the pinnacle of the national judiciary, one of the three independent and "coequal" branches of American government. Once dubbed the Ivory Tower by architecture critics, the Court has a Corinthian colonnade and massive twenty-foot-high bronze doors that guard the single most powerful judicial institution in the Western world. Lights still shone in several offices to the right of the Court's entrance, and [ ]
Et cetera, et cetera. But, of course, this encomium to the inherent 'nobility' of the Supreme Court is quickly revealed to be a narrative foil, as Lazarus soon razes this dangerously na ve conception to the ground:
[The] institution is [now] broken into unyielding factions that have largely given up on a meaningful exchange of their respective views or, for that matter, a meaningful explication or defense of their own views. It is of Justices who in many important cases resort to transparently deceitful and hypocritical arguments and factual distortions as they discard judicial philosophy and consistent interpretation in favor of bottom-line results. This is a Court so badly splintered, yet so intent on lawmaking, that shifting 5-4 majorities, or even mere pluralities, rewrite whole swaths of constitutional law on the authority of a single, often idiosyncratic vote. It is also a Court where Justices yield great and excessive power to immature, ideologically driven clerks, who in turn use that power to manipulate their bosses and the institution they ostensibly serve.
Lazurus does not put forward a single, overarching thesis, but in the final chapters, he does suggest a potential future for the Court:
In the short run, the cure for what ails the Court lies solely with the Justices. It is their duty, under the shield of life tenure, to recognize the pathologies affecting their work and to restore the vitality of American constitutionalism. Ultimately, though, the long-term health of the Court depends on our own resolve on whom [we] select to join that institution.
Back in 1998, Lazurus might have had room for this qualified optimism. But from the vantage point of 2022, it appears that the "resolve" of the United States citizenry was not muscular enough to meet his challenge. After all, Lazurus was writing before Bush v. Gore in 2000, which arrogated to the judicial branch the ability to decide a presidential election; the disillusionment of Barack Obama's failure to nominate a replacement for Scalia; and many other missteps in the Court as well. All of which have now been compounded by the Trump administration's appointment of three Republican-friendly justices to the Court, including hypocritically appointing Justice Barrett a mere 38 days before the 2020 election. And, of course, the leaking and ruling in Dobbs v. Jackson, the true extent of which has not been yet. Not of a bit of this is Lazarus' fault, of course, but the Court's recent decisions (as well as the liberal hagiographies of 'RBG') most perforce affect one's reading of the concluding chapters. The other slight defect of Closed Chambers is that, whilst it often implies the importance of the federal and state courts within the judiciary, it only briefly positions the Supreme Court's decisions in relation to what was happening in the House, Senate and White House at the time. This seems to be increasingly relevant as time goes on: after all, it seems fairly clear even to this Brit that relying on an activist Supreme Court to enact progressive laws must be interpreted as a failure of the legislative branch to overcome the perennial problems of the filibuster, culture wars and partisan bickering. Nevertheless, Lazarus' book is in equal parts ambitious, opinionated, scholarly and dare I admit it? wonderfully gossipy. By juxtaposing history, memoir, and analysis, Closed Chambers combines an exacting evaluation of the Court's decisions with a lively portrait of the intellectual and emotional intensity that has grown within the Supreme Court's pseudo-monastic environment all while it struggles with the most impactful legal issues of the day. This book is an excellent and well-written achievement that will likely never be repeated, and a must-read for anyone interested in this ever-increasingly important branch of the US government.

Crashed: How a Decade of Financial Crises Changed the World (2018)
Shutdown: How Covid Shook the World's Economy (2021) Adam Tooze The economic historian Adam Tooze has often been labelled as an unlikely celebrity, but in the fourteen years since the global financial crisis of 2008, a growing audience has been looking for answers about the various failures of the modern economy. Tooze, a professor of history at New York's Columbia University, has written much that is penetrative and thought-provoking on this topic, and as a result, he has generated something of a cult following amongst economists, historians and the online left. I actually read two Tooze books this year. The first, Crashed (2018), catalogues the scale of government intervention required to prop up global finance after the 2008 financial crisis, and it characterises the different ways that countries around the world failed to live up to the situation, such as doing far too little, or taking action far too late. The connections between the high-risk subprime loans, credit default swaps and the resulting liquidity crisis in the US in late 2008 is fairly well known today in part thanks to films such as Adam McKay's 2015 The Big Short and much improved economic literacy in media reportage. But Crashed makes the implicit claim that, whilst the specific and structural origins of the 2008 crisis are worth scrutinising in exacting detail, it is the reaction of states in the months and years after the crash that has been overlooked as a result. After all, this is a reaction that has not only shaped a new economic order, it has created one that does not fit any conventional idea about the way the world 'ought' to be run. Tooze connects the original American banking crisis to the (multiple) European debt crises with a larger crisis of liberalism. Indeed, Tooze somehow manages to cover all these topics and more, weaving in Trump, Brexit and Russia's 2014 annexation of Crimea, as well as the evolving role of China in the post-2008 economic order. Where Crashed focused on the constellation of consequences that followed the events of 2008, Shutdown is a clear and comprehensive account of the way the world responded to the economic impact of Covid-19. The figures are often jaw-dropping: soon after the disease spread around the world, 95% of the world's economies contracted simultaneously, and at one point, the global economy shrunk by approximately 20%. Tooze's keen and sobering analysis of what happened is made all the more remarkable by the fact that it came out whilst the pandemic was still unfolding. In fact, this leads quickly to one of the book's few flaws: by being published so quickly, Shutdown prematurely over-praises China's 'zero Covid' policy, and these remarks will make a reader today squirm in their chair. Still, despite the regularity of these references (after all, mentioning China is very useful when one is directly comparing economic figures in early 2021, for examples), these are actually minor blemishes on the book's overall thesis. That is to say, Crashed is not merely a retelling of what happened in such-and-such a country during the pandemic; it offers in effect a prediction about what might be coming next. Whilst the economic responses to Covid averted what could easily have been another Great Depression (and thus showed it had learned some lessons from 2008), it had only done so by truly discarding the economic rule book. The by-product of inverting this set of written and unwritten conventions that have governed the world for the past 50 years, this 'Washington consensus' if you well, has yet to be fully felt. Of course, there are many parallels between these two books by Tooze. Both the liquidity crisis outlined in Crashed and the economic response to Covid in Shutdown exposed the fact that one of the central tenets of the modern economy ie. that financial markets can be trusted to regulate themselves was entirely untrue, and likely was false from the very beginning. And whilst Adam Tooze does not offer a singular piercing insight (conveying a sense of rigorous mastery instead), he may as well be asking whether we're simply going to lurch along from one crisis to the next, relying on the technocrats in power to fix problems when everything blows up again. The answer may very well be yes.

Looking for the Good War: American Amnesia and the Violent Pursuit of Happiness (2021) Elizabeth D. Samet Elizabeth D. Samet's Looking for the Good War answers the following question what would be the result if you asked a professor of English to disentangle the complex mythology we have about WW2 in the context of the recent US exit of Afghanistan? Samet's book acts as a twenty-first-century update of a kind to Paul Fussell's two books (reviewed above), as well as a deeper meditation on the idea that each new war is seen through the lens of the previous one. Indeed, like The Great War and Modern Memory (1975) and Wartime (1989), Samet's book is a perceptive work of demystification, but whilst Fussell seems to have been inspired by his own traumatic war experience, Samet is not only informed by her teaching West Point military cadets but by the physical and ontological wars that have occurred during her own life as well. A more scholarly and dispassionate text is the result of Samet's relative distance from armed combat, but it doesn't mean Looking for the Good War lacks energy or inspiration. Samet shares John Adams' belief that no political project can entirely shed the innate corruptions of power and ambition and so it is crucial to analyse and re-analyse the role of WW2 in contemporary American life. She is surely correct that the Second World War has been universally elevated as a special, 'good' war. Even those with exceptionally giddy minds seem to treat WW2 as hallowed:
It is nevertheless telling that one of the few occasions to which Trump responded with any kind of restraint while he was in office was the 75th anniversary of D-Day in 2019.
What is the source of this restraint, and what has nurtured its growth in the eight decades since WW2 began? Samet posits several reasons for this, including the fact that almost all of the media about the Second World War is not only suffused with symbolism and nostalgia but, less obviously, it has been made by people who have no experience of the events that they depict. Take Stephen Ambrose, author of Steven Spielberg's Band of Brothers miniseries: "I was 10 years old when the war ended," Samet quotes of Ambrose. "I thought the returning veterans were giants who had saved the world from barbarism. I still think so. I remain a hero worshiper." If Looking for the Good War has a primary thesis, then, it is that childhood hero worship is no basis for a system of government, let alone a crusading foreign policy. There is a straight line (to quote this book's subtitle) from the "American Amnesia" that obscures the reality of war to the "Violent Pursuit of Happiness." Samet's book doesn't merely just provide a modern appendix to Fussell's two works, however, as it adds further layers and dimensions he overlooked. For example, Samet provides some excellent insight on the role of Western, gangster and superhero movies, and she is especially good when looking at noir films as a kind of kaleidoscopic response to the Second World War:
Noir is a world ruled by bad decisions but also by bad timing. Chance, which plays such a pivotal role in war, bleeds into this world, too.
Samet rightfully weaves the role of women into the narrative as well. Women in film noir are often celebrated as 'independent' and sassy, correctly reflecting their newly-found independence gained during WW2. But these 'liberated' roles are not exactly a ringing endorsement of this independence: the 'femme fatale' and the 'tart', etc., reflect a kind of conditional freedom permitted to women by a post-War culture which is still wedded to an outmoded honour culture. In effect, far from being novel and subversive, these roles for women actually underwrote the ambient cultural disapproval of women's presence in the workforce. Samet later connects this highly-conditional independence with the liberation of Afghan women, which:
is inarguably one of the more palatable outcomes of our invasion, and the protection of women's rights has been invoked on the right and the left as an argument for staying the course in Afghanistan. How easily consequence is becoming justification. How flattering it will be one day to reimagine it as original objective.
Samet has ensured her book has a predominantly US angle as well, for she ends her book with a chapter on the pseudohistorical Lost Cause of the Civil War. The legacy of the Civil War is still visible in the physical phenomena of Confederate statues, but it also exists in deep-rooted racial injustice that has been shrouded in euphemism and other psychological devices for over 150 years. Samet believes that a key part of what drives the American mythology about the Second World War is the way in which it subconsciously cleanses the horrors of brother-on-brother murder that were seen in the Civil War. This is a book that is not only of interest to historians of the Second World War; it is a work for anyone who wishes to understand almost any American historical event, social issue, politician or movie that has appeared since the end of WW2. That is for better or worse everyone on earth.

Utkarsh Gupta: FOSS Activites in December 2022

Here s my (thirty-ninth) monthly but brief update about the activities I ve done in the F/L/OSS world.

Debian
This was my 48th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas 19! \o/ There s a bunch of things I do, both, technical and non-technical. Here are the things I did this month:
  • Some DebConf work.
  • Sponsoring stuff for non-DDs.
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

Ubuntu
This was my 23rd month of actively contributing to Ubuntu. Now that I joined Canonical to work on Ubuntu full-time, there s a bunch of things I do! \o/ I mostly worked on different things, I guess. I was too lazy to maintain a list of things I worked on so there s no concrete list atm. Maybe I ll get back to this section later or will start to list stuff from the fall, as I was doing before. :D

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the stretch and jessie release (+2 years after LTS support). This was my thirty-ninth month as a Debian LTS and thirtieth month as a Debian ELTS paid contributor.
I worked for 51.50 hours for LTS and 22.50 hours for ELTS.

LTS CVE Fixes and Announcements:

ELTS CVE Fixes and Announcements:
  • Issued ELA 752-1, fixing CVE-2021-41182, CVE-2021-41183, CVE-2021-41184, and CVE-2022-31160, for jqueryui.
    For Debian 9 stretch, these problems have been fixed in version 1.12.1+dfsg-4+deb9u1.
  • Helped facilitate Erlang s and RabbitMQ s update; cf: ELA 754-1.
  • Looked through python3.4 s FTBFS on armhf. Even diff d with Ubuntu. No luck. Inspected the traces, doesn t give a lot of hint either. Will continue to look later next month or so but it s a rabbit hole. (:
  • Inspected joblib s security update upon Helmut s investigation and see what went wrong there.
  • Started to look at other set of packages: dropbear, tiff, et al.

Other (E)LTS Work:
Until next time.
:wq for today.

Simon Josefsson: Preseeding Trisquel Virtual Machines Using netinst Images

I m migrating some self-hosted virtual machines to Trisquel, and noticed that Trisquel does not offer cloud-images similar to the Debian Cloud and Ubuntu Cloud images. Thus my earlier approach based on virt-install --cloud-init and cloud-localds does not work with Trisquel. While I hope that Trisquel will eventually publish cloud-compatible images, I wanted to document an alternative approach for Trisquel based on preseeding. This is how I used to install Debian and Ubuntu in the old days, and the automated preseed method is best documented in the Debian installation manual. I was hoping to forget about the preseed format, but maybe it will become one of those legacy technologies that never really disappears? Like FAT16 and 8-bit microcontrollers. Below I assume you have a virtual machine host server up that runs libvirt and has virt-install and similar tools; install them with the following command. I run a pre-release version of Trisquel 11 aramo on my VM-host, but I believe any recent dpkg-based distribution like Trisquel 9/10, PureOS 10, Debian 11 or Ubuntu 20.04/22.04 would work.
apt-get install libvirt-daemon-system virtinst genisoimage cloud-image-utils osinfo-db-tools
The approach can install Trisquel 9 (etiona), Trisquel 10 (nabia) and the pre-release of Trisquel 11. First download and verify the integrity of the netinst images that we will need. Unfortunately the Trisquel 11 netinst beta image does not have any checksum or signature available.
mkdir -p /root/iso
cd /root/iso
wget -q https://mirror.fsf.org/trisquel-images/trisquel-netinst_9.0.2_amd64.iso
wget -q https://mirror.fsf.org/trisquel-images/trisquel-netinst_9.0.2_amd64.iso.asc
wget -q https://mirror.fsf.org/trisquel-images/trisquel-netinst_9.0.2_amd64.iso.sha256
wget -q https://mirror.fsf.org/trisquel-images/trisquel-netinst_10.0.1_amd64.iso
wget -q https://mirror.fsf.org/trisquel-images/trisquel-netinst_10.0.1_amd64.iso.asc
wget -q https://mirror.fsf.org/trisquel-images/trisquel-netinst_10.0.1_amd64.iso.sha256
wget -q -O- https://archive.trisquel.info/trisquel/trisquel-archive-signkey.gpg   gpg --import
sha256sum -c trisquel-netinst_9.0.2_amd64.iso.sha256
gpg --verify trisquel-netinst_9.0.2_amd64.iso.asc
sha256sum -c trisquel-netinst_10.0.1_amd64.iso.sha256
gpg --verify trisquel-netinst_10.0.1_amd64.iso.asc
wget -q https://cdbuilds.trisquel.org/aramo/trisquel-netinst_11.0-20221225_amd64.iso
echo '179566639ca8f14f0c3d5658209c59a0916d9e3bf9c026660cc07b28f2311631  trisquel-netinst_11.0-20221225_amd64.iso'   sha256sum -c
I have developed the following fairly minimal preseed file that works with all three Trisquel releases. Compare it against the official Trisquel 11 preseed skeleton and the Debian 11 example preseed file. You should modify obvious things like SSH key, host/IP settings, partition layout and decide for yourself how to deal with passwords. While Ubuntu/Trisquel usually wants to setup a user account, I prefer to login as root hence setting passwd/root-login to true and passwd/make-user to false.

root@trana:~# cat>trisquel.preseed 
d-i debian-installer/locale select en_US
d-i keyboard-configuration/xkb-keymap select us
d-i netcfg/choose_interface select auto
d-i netcfg/disable_autoconfig boolean true
d-i netcfg/get_ipaddress string 192.168.10.201
d-i netcfg/get_netmask string 255.255.255.0
d-i netcfg/get_gateway string 192.168.10.46
d-i netcfg/get_nameservers string 192.168.10.46
d-i netcfg/get_hostname string trisquel
d-i netcfg/get_domain string sjd.se
d-i clock-setup/utc boolean true
d-i time/zone string UTC
d-i mirror/country string manual
d-i mirror/http/hostname string ftp.acc.umu.se
d-i mirror/http/directory string /mirror/trisquel/packages
d-i mirror/http/proxy string
d-i partman-auto/method string regular
d-i partman-partitioning/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true
d-i partman-basicfilesystems/no_swap boolean false
d-i partman-auto/expert_recipe string myroot :: 1000 50 -1 ext4 \
     $primary    $bootable    method  format   \
     format    use_filesystem    filesystem  ext4   \
     mountpoint  /   \
    .
d-i partman-auto/choose_recipe select myroot
d-i passwd/root-login boolean true
d-i user-setup/allow-password-weak boolean true
d-i passwd/root-password password r00tme
d-i passwd/root-password-again password r00tme
d-i passwd/make-user boolean false
tasksel tasksel/first multiselect
d-i pkgsel/include string openssh-server
popularity-contest popularity-contest/participate boolean false
d-i grub-installer/only_debian boolean true
d-i grub-installer/with_other_os boolean true
d-i grub-installer/bootdev string default
d-i finish-install/reboot_in_progress note
d-i preseed/late_command string mkdir /target/root/.ssh ; echo ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILzCFcHHrKzVSPDDarZPYqn89H5TPaxwcORgRg+4DagE cardno:FFFE67252015 > /target/root/.ssh/authorized_keys
^D
root@trana:~# 
Use the file above as a skeleton for preparing a VM-specific preseed file as follows. The environment variables HOST and IPS will be used later on too.

root@trana:~# HOST=foo
root@trana:~# IP=192.168.10.197
root@trana:~# sed -e "s,get_ipaddress string.*,get_ipaddress string $IP," -e "s,get_hostname string.*,get_hostname string $HOST," < trisquel.preseed > vm-$HOST.preseed
root@trana:~# 
The following script is used to prepare the ISO images with the preseed file that we will need. This script is inspired by the Debian Wiki Preseed EditIso page and the Trisquel ISO customization wiki page. There are a couple of variations based on earlier works. Paths are updated to match the Trisquel netinst ISO layout, which differ slightly from Debian. We modify isolinux.cfg to boot the auto label without a timeout. On Trisquel 11 the auto boot label exists, but on Trisquel 9 and Trisquel 10 it does not exist so we add it in order to be able to start the automated preseed installation.

root@trana:~# cat gen-preseed-iso 
#!/bin/sh
# Copyright (C) 2018-2022 Simon Josefsson -- GPLv3+
# https://wiki.debian.org/DebianInstaller/Preseed/EditIso
# https://trisquel.info/en/wiki/customizing-trisquel-iso
set -e
set -x
ISO="$1"
PRESEED="$2"
OUTISO="$3"
LASTPWD="$PWD"
test -f "$ISO"
test -f "$PRESEED"
test ! -f "$OUTISO"
TMPDIR=$(mktemp -d)
mkdir "$TMPDIR/mnt"
mkdir "$TMPDIR/tmp"
cp "$PRESEED" "$TMPDIR"/preseed.cfg
cd "$TMPDIR"
mount "$ISO" mnt/
cp -rT mnt/ tmp/
umount mnt/
chmod +w -R tmp/
gunzip tmp/initrd.gz
echo preseed.cfg   cpio -H newc -o -A -F tmp/initrd
gzip tmp/initrd
chmod -w -R tmp/
sed -i "s/timeout 0/timeout 1/" tmp/isolinux.cfg
sed -i "s/default vesamenu.c32/default auto/" tmp/isolinux.cfg
if ! grep -q auto tmp/adtxt.cfg; then
    cat<<EOF >> tmp/adtxt.cfg
label auto
	menu label ^Automated install
	kernel linux
	append auto=true priority=critical vga=788 initrd=initrd.gz --- quiet
EOF
fi
cd tmp/
find -follow -type f   xargs md5sum  > md5sum.txt
cd ..
cd "$LASTPWD"
genisoimage -r -J -b isolinux.bin -c boot.cat \
            -no-emul-boot -boot-load-size 4 -boot-info-table \
            -o "$OUTISO" "$TMPDIR/tmp/"
rm -rf "$TMPDIR"
exit 0
^D
root@trana:~# chmod +x gen-preseed-iso 
root@trana:~# 
Next run the command on one of the downloaded ISO image and the generated preseed file.

root@trana:~# ./gen-preseed-iso /root/iso/trisquel-netinst_10.0.1_amd64.iso vm-$HOST.preseed vm-$HOST.iso
+ ISO=/root/iso/trisquel-netinst_10.0.1_amd64.iso
+ PRESEED=vm-foo.preseed
+ OUTISO=vm-foo.iso
+ LASTPWD=/root
+ test -f /root/iso/trisquel-netinst_10.0.1_amd64.iso
+ test -f vm-foo.preseed
+ test ! -f vm-foo.iso
+ mktemp -d
+ TMPDIR=/tmp/tmp.mNEprT4Tx9
+ mkdir /tmp/tmp.mNEprT4Tx9/mnt
+ mkdir /tmp/tmp.mNEprT4Tx9/tmp
+ cp vm-foo.preseed /tmp/tmp.mNEprT4Tx9/preseed.cfg
+ cd /tmp/tmp.mNEprT4Tx9
+ mount /root/iso/trisquel-netinst_10.0.1_amd64.iso mnt/
mount: /tmp/tmp.mNEprT4Tx9/mnt: WARNING: source write-protected, mounted read-only.
+ cp -rT mnt/ tmp/
+ umount mnt/
+ chmod +w -R tmp/
+ gunzip tmp/initrd.gz
+ echo preseed.cfg
+ cpio -H newc -o -A -F tmp/initrd
5 blocks
+ gzip tmp/initrd
+ chmod -w -R tmp/
+ sed -i s/timeout 0/timeout 1/ tmp/isolinux.cfg
+ sed -i s/default vesamenu.c32/default auto/ tmp/isolinux.cfg
+ grep -q auto tmp/adtxt.cfg
+ cat
+ cd tmp/
+ find -follow -type f
+ xargs md5sum
+ cd ..
+ cd /root
+ genisoimage -r -J -b isolinux.bin -c boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -o vm-foo.iso /tmp/tmp.mNEprT4Tx9/tmp/
I: -input-charset not specified, using utf-8 (detected in locale settings)
Using GCRY_000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/gcry_sha512.mod (gcry_sha256.mod)
Using XNU_U000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/xnu_uuid.mod (xnu_uuid_test.mod)
Using PASSW000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/password_pbkdf2.mod (password.mod)
Using PART_000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/part_sunpc.mod (part_sun.mod)
Using USBSE000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/usbserial_pl2303.mod (usbserial_ftdi.mod)
Using USBSE001.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/usbserial_ftdi.mod (usbserial_usbdebug.mod)
Using VIDEO000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/videotest.mod (videotest_checksum.mod)
Using GFXTE000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/gfxterm_background.mod (gfxterm_menu.mod)
Using GCRY_001.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/gcry_sha256.mod (gcry_sha1.mod)
Using MULTI000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/multiboot2.mod (multiboot.mod)
Using USBSE002.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/usbserial_usbdebug.mod (usbserial_common.mod)
Using MDRAI000.MOD;1 for  /tmp/tmp.mNEprT4Tx9/tmp/boot/grub/x86_64-efi/mdraid09.mod (mdraid09_be.mod)
Size of boot image is 4 sectors -> No emulation
 22.89% done, estimate finish Thu Dec 29 23:36:18 2022
 45.70% done, estimate finish Thu Dec 29 23:36:18 2022
 68.56% done, estimate finish Thu Dec 29 23:36:18 2022
 91.45% done, estimate finish Thu Dec 29 23:36:18 2022
Total translation table size: 2048
Total rockridge attributes bytes: 24816
Total directory bytes: 40960
Path table size(bytes): 64
Max brk space used 46000
21885 extents written (42 MB)
+ rm -rf /tmp/tmp.mNEprT4Tx9
+ exit 0
root@trana:~#
Now the image is ready for installation, so invoke virt-install as follows. The machine will start directly, launching the preseed automatic installation. At this point, I usually click on the virtual machine in virt-manager to follow screen output until the installation has finished. If everything works OK the machines comes up and I can ssh into it.

root@trana:~# virt-install --name $HOST --disk vm-$HOST.img,size=5 --cdrom vm-$HOST.iso --osinfo linux2020 --autostart --noautoconsole --wait
Using linux2020 default --memory 4096
Starting install...
Allocating 'vm-foo.img'                                                                                                                                     0 B  00:00:00 ... 
Creating domain...                                                                                                                                          0 B  00:00:00     
Domain is still running. Installation may be in progress.
Waiting for the installation to complete.
Domain has shutdown. Continuing.
Domain creation completed.
Restarting guest.
root@trana:~# 
There are some problems that I have noticed that would be nice to fix, but are easy to work around. The first is that at the end of the installation of Trisquel 9 and Trisquel 10, the VM hangs after displaying Sent SIGKILL to all processes followed by Requesting system reboot. I kill the VM manually using virsh destroy foo and start it up again using virsh start foo. For production use I expect to be running Trisquel 11, where the problem doesn t happen, so this does not bother me enough to debug further. The remaining issue that once booted, a Trisquel 11 VM has lost its DNS nameserver configuration, presumably due to poor integration with systemd-resolved. Both Trisquel 9 and Trisquel 10 uses systemd-resolved where DNS works after first boot, so this appears to be a Trisquel 11 bug. You can work around it with rm -f /etc/resolv.conf && echo 'nameserver A.B.C.D' > /etc/resolv.conf or drink the systemd Kool-Aid. If you want to clean up and re-start the process, here is how you wipe out what you did. After this, you may run the sed, ./gen-preseed-iso and virt-install commands again. Remember, use virsh shutdown foo to gracefully shutdown a VM.

root@trana:~# virsh destroy foo
Domain 'foo' destroyed
root@trana:~# virsh undefine foo --remove-all-storage
Domain 'foo' has been undefined
Volume 'vda'(/root/vm-foo.img) removed.
root@trana:~# rm vm-foo.*
root@trana:~# 
Happy hacking on your virtal machines!

23 December 2022

Louis-Philippe V ronneau: 2022 A Musical Retrospective

With the end of the year approaching fast, I thought putting my year in retrospective via music would be a fun thing to do. Albums In 2022, I added 51 new albums to my collection nearly one a week! I listed them below in the order in which I acquired them. I purchased most of these albums when I could and borrowed the rest at libraries. If you want to browse though, I added links to the album covers pointing either to websites where you can buy them or to Discogs when digital copies weren't available1. Browsing through the albums, I can see my tastes really shifted a lot in the last few years. I used to listen to a lot of Hip-Hop, but the recent trends in this genre2 really turn me off. In fact, it seems I didn't add a single Hip-Hop album to my collection this year... Metal also continues to dominate the list. Many thanks to Angry Metal Guy for being the best metal reviewing website out there. Concerts 2022 was also a big change for me, as I started going to much more concerts than I previously did. metalfinder has been working great and I'm really happy with it. Here are the concerts I went to in 2022: I'm looking forward continuing to go to a lot of concerts in 2023!

  1. Some of the albums especially the O ! ones are pretty underground. For most of those, I actually have physical copies I bought and ripped.
  2. Mostly mumble rap, beats than are less and less sample-based, extreme commercialisation and lyrics that are less and less political and engaged.

16 December 2022

Simon Josefsson: Guix 1.4 on NV41PZ

On the shortlist of things to try on my new laptop has been Guix. I have been using Guix on my rsnapshot-based backup server since 2018, and experimented using it on a second laptop but never on my primary daily work machine. The main difference with Guix for me, compared to Debian (or Trisquel), is that Guix follows a rolling release model, even though they prepare stable versioned installation images once in a while. It seems the trend for operating system software releases is to either following a Long-Term-Support approach or adopt a rolling approach. Historically I have found that the rolling release approach, such as following Debian testing, has lead to unreliable systems, since little focus was given to system integration stability. This probably changed in the last 10 years or so, and today add-on systems like Homebrew on macOS gives me access to modern releases of free software easily. While I am likely to stay with LTS releases of GNU/Linux on many systems, the experience with rolling Guix (with unattended-upgrades from a cron job to pull in new code continously) on my backup servers has been smooth: no need for re-installation or debugging of installations for over four years! I tried the Guix 1.4 rc2 installation image on top of my previous Trisquel 11 installation; following the guided Guix installation menus was simple. I installed using wired network, since the WiFi dongle I had did not automatically become available. I put the Guix system on a separate partition, that I left empty when I installed Trisquel, and mounted the same /home that I used for Trisquel. Everything booted fine, and while I had some issues doing guix pull followed by guix system reconfigure /etc/config.scm I eventually got it working by using --allow-downgrade once. I believe this was a symptom of using a release candidate installation image. Guix did not auto-detect Trisquel or set up a Grub boot menu for it, and I have been unable to come up with the right Guix bootloader magic to add a Trisquel boot item again. Fortunately, the EFI boot choser allows me to boot Trisquel again. Guix 1.4 uses Linux-libre 6.0 which is newer than Trisquel 11 s Linux-libre 5.15. The WiFi dongle worked automatically once the system was installed. I will continue to tweak the default system configuration that was generated, it seems a standard GNOME installation does not include Evolution on Guix. Everything else I have tested works fine, including closing the lid and suspend and then resume, however the builtin webcam has a distorted image which does not happen on Trisquel. All in all, it seems the resulting system would be usable enough for me. I will be switching between Trisquel and Guix, but expect to spend most of time for daily work within Trisquel because it gives me the stable Debian-like environment that I ve been used to for ~20 years. Sharing the same /home between Trisquel and Guix may have been a mistake: GNOME handles this badly, and the dock will only contain the lowest-common-denominator of available applications, with the rest removed permanently.

12 December 2022

Matthew Garrett: Quick update on Pluton and Linux

I've been ridiculously burned out for a while now but I'm taking the month off to recover and that's giving me an opportunity to catch up on a lot of stuff. This has included me actually writing some code to work with the Pluton in my Thinkpad Z13. I've learned some more stuff in the process, but based on everything I know I'd still say that in its current form Pluton isn't a threat to free software.

So, first up: by default on the Z13, Pluton is disabled. It's not obviously exposed to the OS at all, which also means there's no obvious mechanism for Microsoft to push out a firmware update to it via Windows Update. The Windows drivers that bind to Pluton don't load in this configuration. It's theoretically possible that there's some hidden mechanism to re-enable it at runtime, but that code doesn't seem to be in Windows at the moment. I'm reasonably confident that "Disabled" is pretty genuinely disabled.

Second, when enabled, Pluton exposes two separate devices. The first of these has an MSFT0101 identifier in ACPI, which is the ID used for a TPM 2 device. The Pluton TPM implementation doesn't work out of the box with existing TPM 2 drivers, though, because it uses a custom start method. TPM 2 devices commonly use something called a "Command Response Buffer" architecture, where a command is written into a buffer, the TPM is told to do a thing, and the response to the command ends up in another buffer. The mechanism to tell the TPM to do a thing varies, and an ACPI table exposed to the OS defines which of those various things should be used for a given TPM. Pluton systems have a mechanism that isn't defined in the existing version of the spec (1.3 rev 8 at the time of writing), so I had to spend a while staring hard at the Windows drivers to figure out how to implement it. The good news is that I now have a patch that successfully gets the existing Linux TPM driver code work correctly with the Pluton implementation.

The second device has an MSFT0200 identifier, and is entirely not a TPM. The Windows driver appears to be a relatively thin layer that simply takes commands from userland and passes them on to the chip - I haven't found any userland applications that make use of this, so it's tough to figure out what functionality is actually available. But what does seem pretty clear from the code I've looked at is that it's a component that only responds when it's asked - if the OS never sends it any commands, it's not able to do anything.

One key point from this recently published Microsoft doc is that the whole "Microsoft can update Pluton firmware" thing does just seem to be the ability for the OS to push new code to the chip at runtime. That means Microsoft can't arbitrarily push new firmware to the chip - the OS needs to be involved. This is unsurprising, but it's nice to see some stronger confirmation of that.

Anyway. tl;dr - Pluton can (now) be used as a regular TPM. Pluton also exposes some additional functionality which is not yet clear, but there's no obvious mechanism for it to compromise user privacy or restrict what users can run on a Free operating system. The Pluton firmware update mechanism appears to be OS mediated, so users who control their OS can simply choose not to opt in to that.

comment count unavailable comments

11 December 2022

Russ Allbery: Review: The Fifth Elephant

Review: The Fifth Elephant, by Terry Pratchett
Series: Discworld #24
Publisher: Harper
Copyright: 2000
Printing: May 2014
ISBN: 0-06-228013-9
Format: Mass market
Pages: 455
The Fifth Elephant is the 24th Discworld and fifth Watch novel, and largely assumes you know who the main characters are. This is not a good place to start. The dwarves are electing a new king. The resulting political conflict is spilling over into the streets of Ankh-Morpork, but that's not the primary problem. First, the replica Scone of Stone, a dwarven artifact used to crown the Low King of the Dwarves, is stolen from the Dwarf Bread Museum. Then, Vimes is dispatched to berwald, ostensibly to negotiate increased fat exports with the new dwarven king. And then Angua disappears, apparently headed towards her childhood home in berwald, which immediately prompts Carrot to resign and head after her. The City Watch is left in the hands of now-promoted Captain Colon. We see lots of Lady Sybil for the first time since Guards! Guards!, and there's a substantial secondary plot with Angua and Carrot and a tertiary plot with Colon making a complete mess of things back home, but this is mostly a Vimes novel. As usual, Vetinari is pushing him outside of his comfort zone, but he's not seriously expecting Vimes to act like an ambassador. He's expecting Vimes to act like a policeman, even though he's way outside his jurisdiction. This time, that means untangling a messy three-sided political situation involving the dwarves, the werewolves, and the vampires. There is some Igor dialogue in this book, but thankfully Pratchett toned it down a lot and it never started to bother me. I do enjoy Pratchett throwing Vimes and his suspicious morality at political problems and watching him go at them sideways. Vimes's definition of crimes is just broad enough to get him fully invested in a problem, but too narrow to give him much patience with the diplomatic maneuvering. It makes him an unpredictable diplomat in a clash of cultures way that's fun to read about. Cheery and Detritus are great traveling companions for this, since both of them also unsettle the dwarves in wildly different ways. I also have to admit that Pratchett is doing more interesting things with the Angua and Carrot relationship than I had feared. In previous books, I was getting tired of their lack of communication and wasn't buying the justifications for it, but I think I finally understand why the communication barriers are there. It's not that Angua refuses to talk to Carrot (although there's still a bit of that going on). It's that Carrot's attitude towards the world is very strange, and gets stranger the closer you are to him. Carrot has always been the character who is too earnest and straightforward and good for Ankh-Morpork and yet somehow makes it work, but Pratchett is doing something even more interesting with the concept of nobility. A sufficiently overwhelming level of heroic ethics becomes almost alien, so contrary to how people normally think that it can make conversations baffling. It's not that Carrot is perfect (sometimes he does very dumb things), it's that his natural behavior follows a set of ethics that humans like to pretend they follow but actually don't and never would entirely. His character should be a boring cliche or an over-the-top parody, and yet he isn't at all. But Carrot's part is mostly a side plot. Even more than Jingo, The Fifth Elephant is establishing Vimes as a force to be reckoned with, even if you take him outside his familiar city. He is in so many ways the opposite of Vetinari, and yet he's a tool that Vetinari is extremely good at using. Colon of course is a total disaster as the head of the Watch, and that's mostly because Colon should never be more than a sergeant, but it's also because even when he's taking the same action as Vimes, he's not doing it for the same reasons or with the same stubborn core of basic morality and loyalty that's under Vimes's suspicious conservatism. The characterization in the Watch novels doesn't seem that subtle or deep at first, but it accumulates over the course of the series in a way that I think is more effective than any of the other story strands. Vetinari, Vimes, and Carrot all represent "right," or at least order, in overlapping stories of right versus wrong, but they do so in radically different ways and with radically different goals. Each time one of them seems ascendant, each time one of their approaches seems more clearly correct, Pratchett throws them at a problem where a different approach is required. It's a great reading experience. This was one of the better Discworld novels even though I found the villains to be a bit tedious and stupid. Recommended. Followed by The Truth in publication order. The next Watch novel is Night Watch. Rating: 8 out of 10

10 December 2022

Simon Josefsson: Trisquel 11 on NV41PZ: First impressions

My NovaCustom NV41PZ laptop arrived a couple of days ago, and today I had some time to install it. You may want to read about my purchasing decision process first. I expected a rough ride to get it to work, given the number of people claiming that modern laptops can t run fully free operating systems. I first tried the Trisquel 10 live DVD and it booted fine including network, but the mouse trackpad did not work. Before investigating it, I noticed a forum thread about Trisquel 11 beta3 images, and being based on Ubuntu 22.04 LTS and has Linux-libre 5.15 it seemed better to start with more modern software. After installing through the live DVD successfully, I realized I didn t like MATE but wanted to keep using GNOME. I reverted back to installing a minimal environment through the netinst image, and manually installed GNOME (apt-get install gnome) since I prefer that over MATE, together with a bunch of other packages. I ve been running it for a couple of hours now, and here is a brief summary of the hardware components that works.
CPUAlder Lake Intel i7-1260P
Memory2x32GB Kingston DDR4 SODIMM 3200MHz
StorageSamsung 980 Pro 2TB NVME
BIOSDasharo Coreboot
GraphicsIntel Xe
Screen (internal)14 1920 1080
Screen (HDMI)Connected to Dell 27 2560 1440
Screen (USB-C)Connected to Dell 27 2560 1440 via Wavlink port extender
WebcamBuiltin 1MP Camera
MicrophoneIntel Alder Lake
KeyboardISO layout, all function keys working
MouseTrackpad, tap clicking and gestures
Ethernet RJ45Realtek RTL8111/8168/8411 with r8169 driver
Memory cardO2 Micro comes up as /dev/mmcblk0
Docking stationWavlink 4xUSB, 2xHDMI, DP, RJ45,
ConnectivityUSB-A, USB-C
AudioIntel Alder Lake
Hardware components and status
So what s not working? Unfortunately, NovaCustom does not offer any WiFi or Bluetooth module that is compatible with Trisquel, so the AX211 (1675x) Wifi/Bluetooth card in it is just dead weight. I imagine it would be possible to get the card to work if non-free firmware is loaded. I don t need Bluetooth right now, and use the Technoetic N-150 USB WiFi dongle when I m not connected to wired network. Compared against my X201, the following factors have improved. I m still unhappy about the following properties with both the NV41PZ and the X201. Hopefully my next laptop will have improved on this further. I hope to be able to resolve the WiFi part by replacing the WiFi module, there appears to be options available but I have not tested them on this laptop yet. Does anyone know of a combined WiFi and Bluetooth M.2 module that would work on Trisquel? While I haven t put the laptop to heavy testing yet, everything that I would expect a laptop to be able to do seems to work fine. Including writing this blog post!

Next.

Previous.